Advancing Interpretability in Parkinson’s Disease Voice Analysis: A Multilingual AI Approach

M. Wójcik-Pędziwiatr, M. Zakrzewski, M. Wodziński, J. Orozco-Arroyave, E. Nöth, M. Rudzińska-Bar, D. Sztaho, T. Rumezak, D. Hemmerling (Kraków, Poland)

Meeting: 2024 International Congress

Abstract Number: 1207

Keywords: Parkinson’s

Category: Technology

Objective: This study aims to advance the interpretability of machine learning (ML) models in the detection of Parkinson’s Disease (PD) through a novel approach analyzing multilingual vocal phonations.

Background: Accurate and early detection of PD remains a significant challenge in neurology. Traditional ML models offer promising avenues for diagnosis through vocal analysis but often lack interpretability, a crucial component for clinical acceptance and understanding. Our research addresses this gap by integrating Explainable Artificial Intelligence (XAI) techniques with Vision Transformer (ViT) models, leveraging the rich, diverse data from multilingual voice recordings.

Method: We adopted a ViT architecture for its capability in handling complex patterns within spectrogram images of vocal phonations (Fig. 1). The study used a multilingual dataset comprising thousands of voice samples across several languages (Table 1), each processed into a mel-spectrogram and a regular spectrogram. We then applied XAI techniques to elucidate the model’s decision-making process, focusing on feature importance and explanation of model behavior.
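The abstract does not report the spectrogram settings or the ViT patch size, so the following is only a minimal sketch of two quantities that shape the model input: where the mel bands sit on the frequency axis, and how many patch tokens a spectrogram image yields. The mel formula is one common convention (the HTK-style variant); the function names and the example sizes (a 224 x 224 image with 16 x 16 patches) are assumptions for illustration, not values from the study.

```python
import math
from typing import List


def hz_to_mel(f_hz: float) -> float:
    """Convert frequency in Hz to mels (HTK-style formula, one common convention)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_mels: int, f_min: float, f_max: float) -> List[float]:
    """Center frequencies (Hz) of n_mels bands spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_mels)]


def num_patches(height: int, width: int, patch: int) -> int:
    """Number of non-overlapping square patches a ViT cuts from an image."""
    if height % patch or width % patch:
        raise ValueError("image sides must be divisible by the patch size")
    return (height // patch) * (width // patch)


# Hypothetical example sizes, not taken from the abstract.
print(num_patches(224, 224, 16))  # 196 patch tokens
print([round(f) for f in mel_center_frequencies(4, 0.0, 8000.0)])
```

Because the bands are evenly spaced in mels rather than in Hz, the low-frequency region of a sustained vowel gets finer resolution than the high-frequency region, which is the main practical difference between the mel-spectrogram and the regular spectrogram inputs.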

Results: Figure 2 presents exemplary mel-spectrograms with XAI results for healthy individuals (A-D) and individuals with Parkinson’s disease (E-H): A&E – Spanish vowel /a/, B&F – Polish vowel /e/, C&G – Italian vowel /e/, D&H – Hungarian vowel /u/. The ViT model achieved a classification accuracy of 89% in distinguishing PD from non-PD voice samples, surpassing previous benchmarks. The XAI techniques showed that the model relies on specific vocal features that are clinically relevant to PD symptoms. These insights not only bolster the model’s diagnostic value but also provide a pathway to more transparent AI tools in healthcare.
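The abstract does not name the XAI method behind these overlays. As a generic illustration of how a patch-level importance map over a spectrogram can be obtained, the sketch below uses occlusion sensitivity, a model-agnostic technique swapped in here for illustration: each patch is masked in turn and the drop in the classifier score is recorded. The `toy_score` function is a hypothetical stand-in for a trained model, and the 4 x 4 input is made-up data.

```python
from typing import Callable, List

Spectrogram = List[List[float]]  # rows = frequency bins, columns = time frames


def occlusion_map(
    spec: Spectrogram,
    score_fn: Callable[[Spectrogram], float],
    patch: int,
    fill: float = 0.0,
) -> List[List[float]]:
    """For each patch, return the score drop caused by masking that patch."""
    n_rows, n_cols = len(spec), len(spec[0])
    baseline = score_fn(spec)
    importance = []
    for r0 in range(0, n_rows, patch):
        row_scores = []
        for c0 in range(0, n_cols, patch):
            masked = [row[:] for row in spec]  # copy so the input is untouched
            for r in range(r0, min(r0 + patch, n_rows)):
                for c in range(c0, min(c0 + patch, n_cols)):
                    masked[r][c] = fill
            row_scores.append(baseline - score_fn(masked))
        importance.append(row_scores)
    return importance


def toy_score(spec: Spectrogram) -> float:
    """Hypothetical classifier stand-in: mean energy in the two lowest rows."""
    low = spec[:2]
    return sum(sum(row) for row in low) / (len(low) * len(low[0]))


toy_spec = [[1.0] * 4 for _ in range(4)]  # made-up 4 x 4 "spectrogram"
heat = occlusion_map(toy_spec, toy_score, patch=2)
print(heat)  # [[0.5, 0.5], [0.0, 0.0]]
```

In this toy case only the patches covering the two lowest rows change the score, so they receive all of the importance. With a real model, `score_fn` would return the predicted PD probability, and the resulting map could be overlaid on the mel-spectrogram in the manner of the panels described above.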

Conclusion: Our research underlines the potential of combining ViT and XAI for PD detection, offering a novel, interpretable, and accurate diagnostic tool. This approach not only enhances model transparency but also sets a precedent for future AI applications in medical diagnostics, particularly in conditions requiring nuanced interpretation of bioacoustic signals.

Table 1. Number of speakers and recordings.

Figure 1. The experimental pipeline.

Figure 2. Exemplary mel-spectrograms.

To cite this abstract in AMA style:

M. Wójcik-Pędziwiatr, M. Zakrzewski, M. Wodziński, J. Orozco-Arroyave, E. Nöth, M. Rudzińska-Bar, D. Sztaho, T. Rumezak, D. Hemmerling. Advancing Interpretability in Parkinson’s Disease Voice Analysis: A Multilingual AI Approach [abstract]. Mov Disord. 2024; 39 (suppl 1). https://www.mdsabstracts.org/abstract/advancing-interpretability-in-parkinsons-disease-voice-analysis-a-multilingual-ai-approach/. Accessed October 29, 2025.

MDS Abstracts - https://www.mdsabstracts.org/abstract/advancing-interpretability-in-parkinsons-disease-voice-analysis-a-multilingual-ai-approach/