Machine learning-based mortality prediction models using national liver transplantation registries are feasible but have limited utility across countries
Ivanics T, So D, Claasen M, Wallace D, Patel MS, Gravely A, Choi WJ, Shwaartz C, Walker K, Erdman L, and Sapisochin G. Machine learning-based mortality prediction models using national liver transplantation registries are feasible but have limited utility across countries. Am J Transplant 2023; 23(1):64-71.
American Journal of Transplantation
Many countries curate national registries of liver transplant (LT) data. These registries are often used to generate predictive models; however, the potential performance and transferability of these models remain unclear. We used data from 3 national registries and developed machine learning algorithm (MLA)-based models to predict 90-day post-LT mortality within and across countries. Predictive performance and external validity of each model were assessed. Prospectively collected data on adult patients (aged ≥18 years) who underwent primary LT between January 2008 and December 2018 from the Canadian Organ Replacement Registry (Canada), National Health Service Blood and Transplantation (United Kingdom), and United Network for Organ Sharing (United States) were used to develop MLA models to predict 90-day post-LT mortality. Models were developed using each registry individually (based on variables inherent to the individual databases) and using all 3 registries combined (based on variables common to the registries [harmonized]). Model performance was evaluated using the area under the receiver operating characteristic (AUROC) curve. The number of patients included was as follows: Canada, n = 1214; the United Kingdom, n = 5287; and the United States, n = 59,558. The best performing MLA-based model was ridge regression across both individual registries and harmonized data sets. Model performance diminished from the individualized to the harmonized registries, especially in the Canadian (individualized ridge: AUROC, 0.74; range, 0.73-0.74; harmonized: AUROC, 0.68; range, 0.50-0.73) and US (individualized ridge: AUROC, 0.71; range, 0.70-0.71; harmonized: AUROC, 0.66; range, 0.66-0.66) data sets. External model performance across countries was poor overall. MLA-based models yield fair discriminatory potential when used within individual databases. However, the external validity of these models is poor when applied across countries. Standardization of registry-based variables could facilitate the added value of MLA-based models in informing decision making in future LTs.
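For readers unfamiliar with the AUROC metric reported above, the following is a minimal Python sketch of how it can be computed. It is not the authors' code: the function name `auroc` is hypothetical, and the outcome labels and predicted risks are invented toy values, not registry data. The sketch uses the rank interpretation of AUROC, that is, the probability that a randomly chosen patient who died is assigned a higher predicted risk than a randomly chosen patient who survived, with ties counted as one half.

```python
def auroc(labels, scores):
    """Return the AUROC for binary labels (1 = died, 0 = survived).

    Computed as the fraction of (died, survived) pairs in which the
    patient who died has the higher predicted risk; ties count as 0.5.
    """
    died = [s for y, s in zip(labels, scores) if y == 1]
    survived = [s for y, s in zip(labels, scores) if y == 0]
    if not died or not survived:
        raise ValueError("AUROC needs at least one case of each outcome")

    wins = 0.0
    for d in died:
        for s in survived:
            if d > s:
                wins += 1.0
            elif d == s:
                wins += 0.5
    return wins / (len(died) * len(survived))


# Toy example (invented values): 90-day outcome and predicted risk
# for six hypothetical patients.
toy_outcome = [1, 0, 1, 0, 0, 1]
toy_risk = [0.9, 0.2, 0.6, 0.4, 0.7, 0.8]

print(round(auroc(toy_outcome, toy_risk), 3))  # prints 0.889
```

An AUROC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking of patients who died above those who survived.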
Medical Subject Headings
Adult; Humans; Adolescent; Liver Transplantation; State Medicine; Canada; Machine Learning; Registries; Retrospective Studies