Glance LG, Nerenz DR, Joynt Maddox KE, Hall BL, and Dick AW. Reproducibility of Hospital Rankings Based on Centers for Medicare & Medicaid Services Hospital Compare Measures as a Function of Measure Reliability. JAMA Netw Open 2021; 4(12):e2137647.
JAMA Netw Open
Importance: Unreliable performance measures can mask poor-quality care and distort financial incentives in value-based purchasing.
Objective: To examine the association between test-retest reliability and the reproducibility of hospital rankings.
Design, Setting, and Participants: In a cross-sectional design, Centers for Medicare & Medicaid Services Hospital Compare data were analyzed for the 2017 (based on 2014-2017 data) and 2018 (based on 2015-2018 data) reporting periods. The study was conducted from December 13, 2020, to September 30, 2021. This analysis was based on 28 measures, including mortality (acute myocardial infarction, congestive heart failure, pneumonia, and coronary artery bypass grafting), readmissions (acute myocardial infarction, congestive heart failure, pneumonia, and coronary artery bypass grafting), and surgical complications (postoperative acute kidney failure, postoperative respiratory failure, postoperative sepsis, and failure to rescue).
Exposures: Measure reliability based on test-retest reliability testing.
Main Outcomes and Measures: The reproducibility of hospital rankings was quantified by calculating the reclassification rate across the 2017 and 2018 reporting periods after categorizing the hospitals into terciles, quartiles, deciles, and statistical outliers. Linear regression analysis was used to examine the association between the reclassification rate and the intraclass correlation coefficient for each of the classification systems.
Results: The analytic cohort consisted of 28 measures from 4452 hospitals with a median of 2927 (IQR, 2378-3160) hospitals contributing data for each measure. The hospitals participating in the Inpatient Prospective Payment System (n = 3195) had a median bed size of 141 (IQR, 69-261), average daily census of 70 (IQR, 24-155) patients, and a median disproportionate share hospital percentage of 38.2% (IQR, 18.7%-36.6%). The median intraclass correlation coefficient was 0.78 (IQR, 0.72-0.81), ranging between 0.50 and 0.85. The median reclassification rate was 70% (IQR, 62%-71%) when hospitals were ranked by deciles, 43% (IQR, 39%-45%) when ranked by quartiles, 34% (IQR, 31%-36%) when ranked by terciles, and 3.8% (IQR, 2.0%-6.2%) when ranked by outlier status. Increases in measure reliability were not associated with decreases in the reclassification rate. Each 0.1-point increase in the intraclass correlation coefficient was associated with a 6.80 (95% CI, 2.28-11.30; P = .005) percentage-point increase in the reclassification rate when hospitals were ranked into performance deciles, 4.15 (95% CI, 1.16-7.14; P = .008) when ranked into performance quartiles, 1.47 (95% CI, 1.84, 4.77; P = .37) when ranked into performance terciles, and 3.70 (95% CI, 1.30-6.09; P = .004) when ranked by outlier status.
Conclusions and Relevance: In this study, more reliable measures were not associated with lower rates of reclassifying hospitals using test-retest reliability testing. These findings suggest that measure reliability should not be assessed with test-retest reliability testing.
Medical Subject Headings
Cross-Sectional Studies; Hospitals; Humans; Medicaid; Medicare; Quality of Health Care; Reproducibility of Results; United States