Performance of large language models in addressing patient queries on colorectal cancer screening in different languages: An international study across 28 countries

Document Type

Article

Publication Date

2-1-2026

Publication Title

Digestive and liver disease : official journal of the Italian Society of Gastroenterology and the Italian Association for the Study of the Liver

Keywords

Humans, Colorectal Neoplasms, Early Detection of Cancer, Language, Comprehension, Surveys and Questionnaires, Asia, Multilingualism, Europe, Male, Female, Mass Screening, Africa, Large Language Models

Abstract

BACKGROUND: Colorectal cancer (CRC) screening reduces incidence and mortality, yet patient adherence remains suboptimal. Large language models may improve participation by answering patient questions in patients' native languages, but their multilingual performance has not been systematically assessed.

METHODS: From April to June 2025, we conducted a cross-continental study involving 28 countries and 23 languages. A standardized set of 15 CRC screening-related questions was translated into each language and submitted to ChatGPT (GPT-4o). Responses were independently evaluated by 140 gastroenterologists (five per country) for accuracy, completeness, and comprehensibility on a 5-point Likert scale. Statistical analyses included t-tests, chi-square tests, and two-way ANOVA.

RESULTS: The study included experts and data from Europe, Asia, Africa, America, and Oceania. Mean scores (±SD) for accuracy, completeness, and comprehensibility were 4.1 ± 1.0, 4.1 ± 1.0, and 4.2 ± 0.9, respectively. Most languages achieved high ratings, with 73.9%, 86.9%, and 82.6% of languages scoring ≥4 for accuracy, completeness, and comprehensibility, respectively. However, lower scores were observed in Chinese, Dutch, and Greek. Variability was also noted between countries sharing the same language, highlighting language- and context-dependent performance.
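
The reported shares can be read as counts of languages. The following is a minimal sketch, not the authors' analysis: it assumes each percentage is taken over the 23 study languages (the abstract does not state the denominator explicitly), and the names `N_LANGUAGES`, `reported`, and `nearest_count` are hypothetical.

```python
# Assumption (not stated in the abstract): each percentage is a share
# of the 23 study languages.
N_LANGUAGES = 23

# Shares of languages scoring >= 4, as reported in RESULTS.
reported = {"accuracy": 73.9, "completeness": 86.9, "comprehensibility": 82.6}


def nearest_count(percent, total):
    """Return the whole-number count whose share of `total` is closest to `percent`."""
    return round(percent / 100 * total)


counts = {name: nearest_count(pct, N_LANGUAGES) for name, pct in reported.items()}

for name, count in counts.items():
    # Print the implied count and the exact share it corresponds to.
    print(f"{name}: {count}/{N_LANGUAGES} = {100 * count / N_LANGUAGES:.2f}%")
```

Under that assumption, the percentages correspond to 17, 20, and 19 of the 23 languages. The exact share for 20 of 23 is 86.96%, which matches the reported 86.9% only if that figure was truncated rather than rounded.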

DISCUSSION: ChatGPT showed a strong ability to answer CRC screening questions across multiple languages, supporting its promise as a multilingual patient education tool. Nonetheless, regional variability calls for careful validation before clinical integration.

Medical Subject Headings

Humans; Colorectal Neoplasms; Early Detection of Cancer; Language; Comprehension; Surveys and Questionnaires; Asia; Multilingualism; Europe; Male; Female; Mass Screening; Africa; Large Language Models

PubMed ID

41436291

ePublication

ePub ahead of print

Volume

58

Issue

2

First Page

250

Last Page

257
