Category: Technology
Objective: To create a conversational chat agent that can provide persons with Parkinson’s disease with information about their condition.
Background: Large Language Models (LLMs) have shown a remarkable ability to hold natural language conversations across many domains. Although LLM agents are now applied in numerous fields, a significant concern is that they can generate misinformation, often referred to as hallucinations. This risk is particularly relevant in the medical domain, where accurate information is critical. To address it, we used recent techniques for customizing general-purpose LLMs to specific domains, thereby reducing the risk that a conversational agent will provide inaccurate information.
Method: We built a Retrieval-Augmented Generation (RAG) pipeline that combines the GPT-4 foundation model with an expert-driven knowledge base to create a conversational chat agent specific to Parkinson’s disease (PD-GPT). The primary purpose of this chat agent is to provide patients with precise and up-to-date information about Parkinson’s disease, similar to the information typically provided by patient advocacy organizations. Importantly, the model is intentionally designed to refrain from offering specific medical advice to users. By customizing the LLM for this purpose, we aimed to enhance the accuracy, relevance, and breadth of the agent’s responses, so that the information it provides is trustworthy, while maintaining a natural conversational flow.
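The retrieve-then-generate idea behind a RAG pipeline can be illustrated with a minimal sketch. This is not the PD-GPT implementation: the passages in KNOWLEDGE_BASE, the bag-of-words retriever, and the function names retrieve and build_prompt are all assumptions made for illustration, and the call to the language model itself is left out. A production system would typically use embedding-based retrieval instead of word counts.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for an expert-curated knowledge base (illustrative text only).
KNOWLEDGE_BASE = [
    "Parkinson's disease is a progressive neurological disorder that affects movement.",
    "Common motor symptoms include tremor, rigidity, and slowness of movement.",
    "Patient advocacy organizations publish educational material for people with Parkinson's disease.",
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question."""
    query = Counter(tokenize(question))
    ranked = sorted(
        passages,
        key=lambda p: cosine(query, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, passages):
    """Assemble a prompt that grounds the model in the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. "
        "Do not give individual medical advice.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )


if __name__ == "__main__":
    question = "What are the motor symptoms such as tremor?"
    print(build_prompt(question, retrieve(question, KNOWLEDGE_BASE, k=1)))
```

The resulting prompt would then be sent to the foundation model. Restricting the answer to retrieved context is the step intended to reduce hallucinations, and the instruction against medical advice mirrors the design goal stated above.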
Results: We present the methodology employed in the development of PD-GPT. We curated a question bank of Parkinson’s-specific ‘exam’ questions that can be used for automated and continuous model evaluation. A preliminary benchmark of PD-GPT using Semantic Answer Similarity [1] and Self Evaluation [2] metrics showed that, for general-level questions, its output quality was on par with the regular ChatGPT-4 chatbot (public version 2024-Feb-28). In contrast, for detailed questions about Parkinson’s disease, PD-GPT outperformed ChatGPT-4 (see [figure1]), with a Self Evaluation score more than twice as high.
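An automated evaluation over a question bank can be sketched as a loop that compares each generated answer with a reference answer and averages the scores. The sketch below is an assumption-laden illustration, not the benchmark used here: the token_cosine function is a simple word-overlap stand-in and is not the Semantic Answer Similarity metric of [1], which relies on a trained transformer model; the question bank format and the names evaluate and answer_fn are hypothetical.

```python
import math
import re
from collections import Counter


def token_cosine(answer, reference):
    """Word-overlap cosine similarity; a stand-in for a learned semantic metric."""
    a = Counter(re.findall(r"[a-z']+", answer.lower()))
    b = Counter(re.findall(r"[a-z']+", reference.lower()))
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def evaluate(question_bank, answer_fn, similarity=token_cosine):
    """Average similarity between generated answers and reference answers.

    question_bank: list of dicts with "question" and "reference" keys (assumed format).
    answer_fn: callable mapping a question string to an answer string,
               for example a wrapper around the chat agent under test.
    """
    scores = [
        similarity(answer_fn(item["question"]), item["reference"])
        for item in question_bank
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

Because the similarity function is a parameter, the same loop could be rerun with a model-based metric whenever the agent or its knowledge base changes, which is what makes continuous evaluation practical.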
Conclusion: Our preliminary results suggest that a tailored LLM can generate informative and coherent responses within the Parkinson’s disease domain and offers an advantage over a general-purpose chatbot for detailed questions.
References: [1] Risch, J., Möller, T., Gutsch, J., & Pietsch, M. (2021). Semantic Answer Similarity for Evaluating Question Answering Models. arXiv [cs.CL]. http://arxiv.org/abs/2108.06130
[2] Ren, J., Zhao, Y., Vu, T., Liu, P. J., & Lakshminarayanan, B. (2023). Self-Evaluation Improves Selective Generation in Large Language Models. arXiv [cs.CL]. http://arxiv.org/abs/2312.09300
To cite this abstract in AMA style:
B. Várkuti, L. Manola, M. Atayi, L. Liemen, G. van Elswijk, F. Lange, M. Reich, L. Erōss, M. Okun, J. Volkmann. A Conversational GPT Agent for Parkinson’s Disease [abstract]. Mov Disord. 2024; 39 (suppl 1). https://www.mdsabstracts.org/abstract/a-conversational-gpt-agent-for-parkinsons-disease/. Accessed December 3, 2024.
MDS Abstracts - https://www.mdsabstracts.org/abstract/a-conversational-gpt-agent-for-parkinsons-disease/