Center for Health Policy and Health Services Research Meeting Abstracts

Leveraging AI to Personalize Symptom Assessment for Bipolar Disorder: A Comparative Study of LLM-driven PHQ-9 Tailoring

Shruti Ramekar
Yanwei Liu
Mason Breitzig, Henry Ford Health
Lan Kong
Erika Saunders
Guodong Liu

Recommended Citation

Ramekar S, Liu Y, Breitzig M, Kong L, Saunders E, Liu G. Leveraging AI to Personalize Symptom Assessment for Bipolar Disorder: A Comparative Study of LLM-driven PHQ-9 Tailoring. Bipolar Disorders 2025; 27:S105.

Document Type

Conference Proceeding

Publication Date

9-12-2025

Publication Title

Bipolar Disorders

Keywords

adult, bipolar disorder, bipolar I disorder, ChatGPT, comparative study, conference abstract, controlled study, female, hanging, human, large language model, leisure, male, patient engagement, Patient Health Questionnaire 9, pleasure, qualitative analysis, questionnaire, simulation, social status, symptom assessment, track and field

Abstract

Introduction: This study evaluates the ability of large language model (LLM)-based tools to customize the Patient Health Questionnaire-9 (PHQ-9) for individuals with bipolar disorder. Method: We simulated 50 cases with diverse demographics and other characteristics, including age, gender, bipolar type, socioeconomic status, interests/hobbies, and social/emotional characteristics and tendencies. ChatGPT®, Gemini®, Microsoft Copilot®, and Claude® were used to adapt the standard PHQ-9 questions to each case. Two independent evaluators carried out a qualitative analysis to assess the quality of PHQ-9 adaptation, contextual relevance, linguistic sensitivity, clarity, and level of personalization. Results: ChatGPT had the most success, followed by Claude, then Gemini and Copilot. Adaptations by ChatGPT were highly tailored to the individual's background, sensitive in tone, and more conversational and faithful to the original questions. For one simulated case (Alice, a female college freshman, upper-middle class, Bipolar I, finance major, on the college track and field team), the PHQ-9 item 'Little interest or pleasure in doing things' was personalized by ChatGPT as 'Alice, have you felt like you're not really enjoying things you normally do? Like, have you been feeling less excited about practice, hanging out with friends, or even school activities lately?'. Questions generated by Claude had a more empathetic tone but deviated stylistically from the originals. Gemini and Copilot achieved little or only minimal personalization. Conclusion: Our study demonstrated the utility of LLM-based tools in personalizing the PHQ-9 questionnaire. Further studies are warranted to investigate whether this AI-based alternative may help promote patient engagement and improve the accuracy of assessment.

Volume

27

First Page

S105
