Category: Technology
Objective:
To describe the rationale and process for acquiring speech data in order to make speech recognition more accessible to patients with Parkinson’s disease (PD).
Background:
Disordered speech and voice may limit access to everyday voice-activated devices (e.g., mobile phones, computers). The automatic speech recognition (ASR) systems in these devices have been trained on non-disordered speech, which makes the devices frustrating to use for individuals with speech disorders such as those accompanying PD. Project Euphonia is a Google initiative to make speech technology more accessible to individuals with non-standard speech. The first step in this process is to gather large numbers of speech samples from disordered speakers in order to train speech recognition systems. This paper reports initial work to collect speech data from individuals with PD, carried out in collaboration with a team of research scientists, software engineers, experts in speech recognition algorithms, designers, and product managers.
Method: Although there is a vast literature on automatic speech recognition algorithms, teaching these algorithms to understand disordered speech requires sufficient speech samples from disordered speakers. Our research team at LSVT Global was invited to collaborate on Project Euphonia because of our more than twenty years of research on speech and voice in PD (e.g., Ramig et al., 1995; Ramig et al., 2001a, 2001b; Ramig et al., 2018), including years of gathering acoustic data from patients with PD, and our access to a large PD community.
Results: After a series of pilot studies, procedures were established to optimize data collection supported by speech therapy mentors. Screening procedures were established for technological, cognitive, and motor challenges and for potential home support. Recruiting was expanded to include eight major Parkinson’s disease organizations. By completion of the final pilot study, 75,356 phrases had been collected from patients with PD, MSA, CBD, and PSP. Outcomes of the speech recognition data analysis will be reported.
Conclusion: A feasible data collection procedure has been established and the project will scale up to include larger numbers of patients, disorders and dialects.
Portions of this project were presented as part of a Virtual ASHA Mini Convention, November 2020.
References:
Ramig, L. O., Countryman, S., Thompson, L. L., & Horii, Y. (1995). A comparison of two forms of intensive speech treatment for Parkinson’s disease. Journal of Speech and Hearing Research, 38, 1232-1251.
Ramig, L. O., Sapir, S., Countryman, S., et al. (2001a). Intensive voice treatment (LSVT®) for individuals with Parkinson’s disease: a two-year follow-up. Journal of Neurology, Neurosurgery, and Psychiatry, 71, 493-498.
Ramig, L. O., Sapir, S., Fox, C. M., & Countryman, S. (2001b). Changes in vocal loudness following intensive voice treatment (LSVT®) in individuals with Parkinson’s disease: a comparison with untreated patients and normal age-matched controls. Movement Disorders, 16, 79-83.
Ramig, L., Halpern, A., Spielman, J., Fox, C., & Freeman, K. (2018). Speech treatment in Parkinson’s disease: Randomized controlled trial (RCT). Movement Disorders, 33(11), 1777-1791.
To cite this abstract in AMA style:
L. Ramig, R. Macdonald, P. Jiang, H. Hodges, J. Spielman, O. Reed, E. Nauman, C. Bergey, J. Cattiau. Building a Data Base for Automatic Speech Recognition in Parkinson’s disease (PD) [abstract]. Mov Disord. 2021; 36 (suppl 1). https://www.mdsabstracts.org/abstract/building-a-data-base-for-automatic-speech-recognition-in-parkinsons-disease-pd/. Accessed November 22, 2024.
MDS Virtual Congress 2021