Alzheimer's disease progression detection model based on an early fusion of cost-effective multimodal data

Shaker El-Sappagh, Hager Saleh, Radhya Sahal, Tamer Abuhmed, S. M. Riazul Islam, Farman Ali, Eslam Amer

Research output: Contribution to journal, Article, peer-review

11 Citations (Scopus)

Abstract

Alzheimer's disease (AD) is a severe neurodegenerative disease. Identifying patients at high risk of conversion from mild cognitive impairment to AD is crucial so that they can receive earlier close monitoring, targeted investigations, and appropriate management. Recently, several machine learning (ML) algorithms have been used for AD progression detection. Most of these studies used only neuroimaging data from baseline visits. However, AD is a complex chronic disease, and a medical expert will usually analyze the patient's whole history when making a progression diagnosis. Furthermore, neuroimaging data are often limited or unavailable, especially in developing countries, because of their cost. In this paper, we compare the performance of five widely used ML algorithms, namely support vector machine, random forest, k-nearest neighbor, logistic regression, and decision tree, in predicting AD progression with a prediction horizon of 2.5 years. We use 1029 subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. In contrast to previous literature, our models are optimized using a collection of cost-effective time-series features, including patients' comorbidities, cognitive scores, medication history, and demographics. Medication and comorbidity text data are semantically prepared. Drug terms are collected and cleaned, encoded using the Anatomical Therapeutic Chemical (ATC) classification ontology, and then semantically aggregated to an appropriate level of ATC granularity to produce a less sparse dataset. Our experiments show that the early fusion of comorbidity and medication features with the other features yields significant predictive power across all models. The random forest model is the most accurate of the five. This study is the first of its kind to investigate the role of such multimodal time-series data in AD prediction.
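
As a rough illustration of two steps named in the abstract, the sketch below aggregates ATC drug codes to a coarser level and then performs early fusion by concatenating features from several modalities into one vector. It is not the authors' pipeline: the function names, the feature names (`MMSE`, `CDRSB`, `age`), the fixed feature layout, and the patient values are all hypothetical, and it handles a single visit rather than a time series. The only fact it relies on is the standard ATC code structure, in which the five levels correspond to prefixes of 1, 3, 4, 5, and 7 characters.

```python
from collections import Counter

# Prefix lengths of the five ATC levels, e.g. "N06DA02": N | N06 | N06D | N06DA | N06DA02.
ATC_LEVEL_LENGTH = {1: 1, 2: 3, 3: 4, 4: 5, 5: 7}


def aggregate_atc(codes, level):
    """Count a patient's drugs per ATC group at the given level (hypothetical helper)."""
    prefix_len = ATC_LEVEL_LENGTH[level]
    return Counter(code[:prefix_len] for code in codes)


def early_fusion(modalities, layout):
    """Concatenate per-modality features into one flat vector (hypothetical helper).

    modalities: dict mapping modality name -> {feature name: value}
    layout:     dict mapping modality name -> ordered list of feature names
    Features missing for a patient are filled with 0.
    """
    fused = []
    for name in sorted(layout):  # fixed modality order keeps vectors comparable
        features = modalities.get(name, {})
        fused.extend(float(features.get(key, 0)) for key in layout[name])
    return fused


# Made-up example visit (not ADNI data).
drug_codes = ["N06DA02", "N06DA03", "C10AA05"]  # donepezil, rivastigmine, atorvastatin
medication = aggregate_atc(drug_codes, level=3)  # coarser groups give fewer, denser columns

visit = {
    "cognitive": {"MMSE": 26, "CDRSB": 1.5},
    "demographics": {"age": 74},
    "medication": medication,
}
layout = {
    "cognitive": ["MMSE", "CDRSB"],
    "demographics": ["age"],
    "medication": ["C10A", "N06D"],
}
fused_vector = early_fusion(visit, layout)
print(fused_vector)
```

A fused vector of this kind could then be passed to any of the five classifiers the abstract lists. Aggregating at a coarser ATC level reduces the number of distinct medication columns, which is one plausible reading of how aggregation yields a less sparse dataset.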

Original language: English
Pages (from-to): 680-699
Number of pages: 20
Journal: Future Generation Computer Systems
Volume: 115
Early online date: 15 Oct 2020
Publication status: Published - 1 Feb 2021
Externally published: Yes
