TY - JOUR
T1 - A hybrid machine learning approach for prediction of conversion from mild cognitive impairment to dementia
AU - Bucholc, Magda
AU - Titarenko, Sofya
AU - Ding, Xuemei
AU - Canavan, Callum
AU - Chen, Tianhua
PY - 2023/5/1
Y1 - 2023/5/1
N2 - Mild cognitive impairment (MCI) represents a precursor to dementia for many individuals; however, some forms of MCI tend to remain stable over time and do not progress to dementia. In fact, conversion rates vary substantially depending on the diagnostic criteria used and the nature of the analytic sample and clinical setting. To identify personalized strategies to prevent or slow the progression of dementia and to support the clinical development of novel treatments, we need to develop new approaches for modelling disease progression that can differentiate between progressive and non-progressive MCI subjects. The aim of this study was to develop a novel prognostic machine learning (ML) framework utilising longitudinal information encoded in efficient, cost-effective, and non-invasive markers to identify MCI subjects that are at risk for developing dementia. Our approach was developed using the dataset from the National Alzheimer's Coordinating Center. We built two prognostic models based on the patient data from 3 (n = 768) (Model 1) and 4 (n = 409) (Model 2) assessment visits. A novel hybrid prognostic approach, using cognitive trajectory classes, generated through unsupervised learning (Stage 1), as input in supervised ML models (Stage 2), was developed and systematically tested. Our unsupervised learning approach (Stage 1) involved: (i) the implementation of the longitudinal data partitioning method allowing for clustering trajectories based on their shapes; (ii) validation of the optimal number of clusters using three different Clustering Validity Indices (CVIs); and (iii) application of the fusion-based methods for combining CVIs into the fused normalized CVI scores, averaged for each cluster partition to determine the final number of trajectory classes for each type of clinical score. In Stage 2, we built four types of prognostic models based on random forest (RF), Support Vector Machines (SVM), logistic regression (LR), and kNN ensemble approaches. 
Classification models incorporating both clinical scores and cognitive trajectory classes as input showed up to 6.5 % higher accuracy than models based only on clinical scores (p < 0.05 in all cases). Given the patient data from three time points (Model 1), the highest prediction accuracy was achieved by the ensemble and RF models, i.e., 85.0 % (standard deviation: 3.1 %) and 84.6 % (4.1 %), respectively. Using the patient data from four time points (Model 2), the highest accuracy was achieved by the RF and ensemble models, i.e., 87.5 % (6.1 %) and 86.8 % (3.7 %), respectively. We showed that the incorporation of the output of unsupervised learning significantly improved the performance of supervised ML models. Our prognostic framework can be applied to improve recruitment in clinical trials and to select early interventions for individuals at high risk of developing dementia.
KW - dementia
KW - mild cognitive impairment
KW - machine learning
KW - longitudinal modelling
KW - unsupervised learning
KW - prognostic model
UR - http://www.scopus.com/inward/record.url?scp=85146554818&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2023.119541
DO - 10.1016/j.eswa.2023.119541
M3 - Article
VL - 217
JO - Expert Systems with Applications
JF - Expert Systems with Applications
SN - 0957-4174
M1 - 119541
ER -