Application of radiomics for the prediction of HPV status for patients with head and neck cancers

Publication Title

Medical physics


PURPOSE: To perform radiomic analysis of primary tumors extracted from pretreatment contrast-enhanced computed tomography (CE-CT) images of patients with oropharyngeal cancers, in order to identify discriminant features and construct an optimal classifier for the characterization and prediction of human papillomavirus (HPV) status.

MATERIALS AND METHODS: One hundred eighty-seven patients with oropharyngeal cancers and known HPV status (confirmed by immunohistochemistry-p16 protein testing) were retrospectively studied in two groups. Group A: 95 patients (19 HPV- and 76 HPV+) from the MICCAI grand challenge. Group B: 92 patients (52 HPV- and 40 HPV+) from our institution. Radiomic features (172) were extracted from the gross tumor volume (GTV) on pretreatment diagnostic CE-CT images. Levene's and Kolmogorov-Smirnov tests, together with the absolute biserial correlation (|BSC| > 0.48), were used to identify the discriminant features between the HPV+ and HPV- groups. The discriminant features were used to train and test eight different classifiers. Area under the receiver operating characteristic curve (AUC) and positive and negative predictive values (PPV and NPV, respectively) were used to evaluate the performance of the classifiers. Principal component analysis (PCA) was applied to the discriminant feature set, and seven principal components (PCs) were used to train and test a generalized linear model (GLM) classifier.
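The correlation-based part of the feature screening described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the exact biserial formula, so the sketch uses the point-biserial form (Pearson correlation between a feature and 0/1 labels). The function names, feature names, and toy values are hypothetical. The Levene and Kolmogorov-Smirnov significance tests are not reproduced here; a statistics library (for example `scipy.stats.levene` and `scipy.stats.ks_2samp`) would typically be used for those.

```python
from math import sqrt

def point_biserial(values, labels):
    """Pearson correlation between a continuous feature and 0/1 labels.

    Assumption: the point-biserial form stands in for the abstract's
    'biserial correlation', whose exact definition is not given.
    """
    n = len(values)
    mean_v = sum(values) / n
    mean_l = sum(labels) / n
    cov = sum((v - mean_v) * (l - mean_l) for v, l in zip(values, labels))
    var_v = sum((v - mean_v) ** 2 for v in values)
    var_l = sum((l - mean_l) ** 2 for l in labels)
    if var_v == 0 or var_l == 0:
        return 0.0  # constant feature or single-class labels: no association
    return cov / sqrt(var_v * var_l)

def screen_features(features, labels, threshold=0.48):
    """Return names of features whose absolute correlation exceeds the threshold."""
    return [name for name, values in features.items()
            if abs(point_biserial(values, labels)) > threshold]

# Toy data invented for illustration only: 1 = HPV+, 0 = HPV-
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "sphericity": [0.82, 0.79, 0.85, 0.80, 0.61, 0.66, 0.58, 0.63],
    "noise": [0.5, 0.1, 0.9, 0.3, 0.4, 0.8, 0.2, 0.6],
}
selected = screen_features(features, labels)
print(selected)
```

In this toy example only the feature that separates the two groups clears the 0.48 cutoff; in the study the cutoff was applied alongside the significance tests rather than on its own.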

RESULTS: Among the 172 radiomic features, only 12 (from 3 categories) were significantly different (P < 0.05, |BSC| > 0.48) between the HPV+ and HPV- groups. Among the eight classifiers trained and applied for prediction of HPV status, the GLM showed the highest performance for each discriminant feature and for the combined 12 features: AUC/PPV/NPV = 0.878/0.834/0.811. The GLM retained high predictive power on unseen test datasets: AUC/PPV/NPV = 0.849/0.731/0.788 for group A and 0.869/0.807/0.870 for group B. After the correlation among discriminant features was eliminated by applying PCA, the performance of the GLM improved by 3.3%, 2.2%, and 1.8% for AUC, PPV, and NPV, respectively.
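For readers unfamiliar with the three reported metrics, the sketch below shows one common way to compute AUC, PPV, and NPV from classifier scores. It is an illustration only: the 0.5 decision cutoff, the function names, and the toy scores are assumptions, and the abstract does not state how the study thresholded its classifier outputs.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def ppv_npv(scores, labels, cutoff=0.5):
    """Positive and negative predictive values at a given score cutoff.

    Assumption: scores at or above the cutoff are called HPV+.
    """
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        predicted_positive = s >= cutoff
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv

# Toy scores invented for illustration only: 1 = HPV+, 0 = HPV-
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc_value = auc(toy_scores, toy_labels)
ppv, npv = ppv_npv(toy_scores, toy_labels)
print(auc_value, ppv, npv)
```

AUC is independent of the cutoff, whereas PPV and NPV depend on both the cutoff and the proportion of HPV+ cases in the group, which differs markedly between groups A and B.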

CONCLUSION: The results imply that GTVs of HPV+ patients exhibit higher intensities, smaller lesion size, greater sphericity/roundness, and higher spatial intensity-variation/heterogeneity. The results suggest that radiomic features primarily associated with the spatial arrangement and morphological appearance of the tumor on contrast-enhanced diagnostic CT datasets could potentially be used for classification of HPV status.

ePub ahead of print