Objective: To analyze publicly available voice, Tappy keystroke, spiral drawing, and gait data from PD patients and controls using machine learning models, and to identify which early PD characteristics are more pronounced than others.
Background: Parkinson’s disease (PD) affects approximately 6 million people worldwide. Data analysis of voice, Tappy keystroke, spiral drawings, or gait data using machine learning (ML) models may provide an inexpensive, non-invasive, and simple method for remote diagnosis of PD before the motor signs manifest.
Method: An ML model based on Random Forest was developed to analyze existing clinical data from PD patients and non-PD participants (healthy controls). We reviewed the UCI (https://archive.ics.uci.edu/), PPMI (https://www.ppmi-info.org/), and Kaggle (www.Kaggle.com) databases. ML analysis was carried out on voice samples in PD and in REM sleep behavior disorder (RBD), Tappy keystroke, spiral drawing, and gait data sets from the Kaggle database. Fig.1
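As an illustration of the kind of pipeline described in the Method, the following is a minimal sketch that trains a Random Forest classifier with scikit-learn and scores it on a held-out split. It is not the study's code: the data are synthetic (generated with `make_classification`), and the feature count, class balance, split ratio, and number of trees are assumptions chosen only for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for one modality's feature table (e.g. voice features).
# Label 1 = PD, label 0 = healthy control. This is NOT the study data.
X, y = make_classification(
    n_samples=400,
    n_features=20,
    n_informative=8,
    weights=[0.3, 0.7],  # assumed class balance: more PD than control samples
    random_state=0,
)

# Hold out 25% of the samples for testing, preserving the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumed hyperparameters; the abstract does not report the ones it used.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# The same four metrics that the Results section reports.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

In practice, one such model would be fitted per data set (voice, keystroke, spiral, gait), with the synthetic table replaced by that data set's extracted features.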
Results: ML analysis of voice data yielded accuracy 88.72%, precision 90.86%, recall 95.22%, and F1 score 92.77%. Fig.2
Analysis of Tappy keystroke data yielded accuracy 72.79%, precision 76.50%, recall (sensitivity) 93.08%, and F1 score 83.97%. Fig.3
Analysis of spiral drawing data yielded accuracy 70.97%, precision 74.09%, recall 82.89%, and F1 score 77.90%. Fig.4
Analysis of gait data yielded accuracy 63.83%, precision 67.90%, recall 74.81%, and F1 score 70.01%. Fig.5
Analysis of voice data in RBD yielded accuracy 70.00%, precision 72.00%, recall 70.00%, and F1 score 69.00%. Fig.6
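For readers less familiar with these four metrics, the sketch below shows how each is conventionally computed from the counts of a binary confusion matrix, with PD as the positive class. The function name and the counts are hypothetical and are used only to make the arithmetic concrete; they are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary metrics from confusion-matrix counts.

    tp: PD samples predicted as PD        fp: controls predicted as PD
    fn: PD samples predicted as control   tn: controls predicted as control
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # also called sensitivity
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, chosen only for illustration.
m = classification_metrics(tp=90, fp=10, fn=5, tn=20)
for name, value in m.items():
    print(f"{name}: {value:.2%}")
```

Recall measures how many PD samples are detected, so a false negative (`fn`) corresponds to a missed PD case. A score that has been averaged over folds or classes need not equal the harmonic mean of the averaged precision and recall.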
Conclusion: The ML prediction model developed may help improve risk prediction in PD for early intervention and resource prioritization. An ML model based on the Random Forest algorithm was developed to analyze various PD characteristics before clinical diagnosis of PD. The current study suggests that voice analysis is the most robust test, followed by Tappy keystroke, spiral drawings, and gait analysis, in that order. Voice is affected even in RBD patients, suggesting that voice is a sensitive and early measure of prodromal PD. The low accuracy of the analysis indicates that several PD-positive samples would go undetected or be misclassified. Combining all four measures (voice, Tappy keystroke, spiral drawings, and gait) may improve accuracy. Fig.7
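One simple way the four measures could be combined is late fusion: each modality's classifier outputs a probability of PD, and the probabilities are averaged before thresholding. The sketch below is purely illustrative and is not the method of the abstract. It assumes all four measurements are available for the same subject, which the separate public data sets do not guarantee, and the function name, probabilities, weights, and the 0.5 threshold are all hypothetical.

```python
def fuse_probabilities(modality_probs, weights=None):
    """Weighted average of per-modality PD probabilities (late fusion)."""
    if weights is None:
        # Default: every modality contributes equally.
        weights = {name: 1.0 for name in modality_probs}
    total_weight = sum(weights[name] for name in modality_probs)
    weighted_sum = sum(weights[name] * p for name, p in modality_probs.items())
    return weighted_sum / total_weight


# Hypothetical per-modality outputs for a single subject.
subject = {"voice": 0.80, "keystroke": 0.60, "spiral": 0.40, "gait": 0.55}

fused = fuse_probabilities(subject)
predicted_pd = fused >= 0.5  # assumed decision threshold
print(f"fused probability: {fused:.4f}, predicted PD: {predicted_pd}")

# Optionally weight a modality more heavily (weights here are made up).
fused_weighted = fuse_probabilities(
    subject, weights={"voice": 2.0, "keystroke": 1.0, "spiral": 1.0, "gait": 1.0}
)
print(f"weighted fused probability: {fused_weighted:.4f}")
```

Whether fusion actually improves accuracy would have to be tested on a cohort in which every participant has all four measurements.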
References: 1. Adams, W.R. High-accuracy detection of early Parkinson’s Disease using multiple characteristics of finger movement while typing. PLOS ONE, 2017, 12(11), e0188226.
2. Little, M.A., McSharry, P.E., Hunter, E.J., Spielman, J., Ramig, L.O. Suitability of dysphonia measurements for telemonitoring of Parkinson’s disease. IEEE Trans Biomed Eng., 2009, 56(4), 1015.
3. Isenkul, M.E., Sakar, B.E., Kursun, O. Improved spiral test using digitized graphics tablet for monitoring Parkinson’s disease. The 2nd International Conference on e-Health and Telemedicine (ICEHTM-2014), 2014, pp. 171-175.
4. Yogev, G., Giladi, N., Peretz, C., Springer, S., Simon, E.S., Hausdorff, J.M. Dual tasking, gait rhythmicity, and Parkinson’s disease: Which aspects of gait are attention demanding? Eur J Neuroscience, 2005, 22, 1248-1256.
To cite this abstract in AMA style:
A. Vaish. A Machine Learning Approach for Early Identification of Prodromal Parkinson’s Disease [abstract]. Mov Disord. 2024; 39 (suppl 1). https://www.mdsabstracts.org/abstract/a-machine-learning-approach-for-early-identification-of-prodromal-parkinsons-disease/. Accessed November 21, 2024.