Utilizing Semi-Supervised Learning and Image Matting in Combination With Mask R-CNN for Accurate Dominant Intraprostatic Lesion Identification and Segmentation on Multiparametric-MRI
Radiation Oncology Feldman AM, Dai Z, Zong W, Pantelic M, Elshaikh MA, and Wen N. Utilizing Semi-Supervised Learning and Image Matting in Combination With Mask R-CNN for Accurate Dominant Intraprostatic Lesion Identification and Segmentation on Multiparametric-MRI. International Journal of Radiation Oncology Biology Physics 2020; 108(3):e257.
International Journal of Radiation Oncology Biology Physics
Purpose/Objective(s): Identification of the dominant intraprostatic lesion (DIL) using multiparametric-MRI (mp-MRI) can aid clinicians in diagnosis, risk stratification, staging, and selection of therapeutic options in men with prostate cancer. Deep-learning-based segmentation models such as Mask R-CNN are an emerging approach capable of identifying and auto-segmenting these lesions. However, model development is limited by relatively sparse annotated data and by anatomic challenges such as the ambiguous transition zone between the DIL and normal prostate tissue. Here we used a Mask R-CNN backbone in combination with semi-supervised training and image matting in an effort to overcome these limitations and achieve accurate segmentation of the DIL.
Materials/Methods: A total of 244 patients with biopsy-proven prostate adenocarcinoma and mp-MRI, split into 2 cohorts, were reviewed. Cohort 1 included 202 patients from the SPIE-AAPM-NCI Prostate MR Gleason Grade Group Challenge (PROSTATEx-2 Challenge). Cohort 2 included 42 patients from our institution. All patients in cohort 2 and 96 patients in cohort 1 had the DIL annotated on T2-weighted imaging (T2WI) by two experienced clinicians from our institution to establish the ground truth. Apparent diffusion coefficient maps were rigidly registered to T2WI. A base Mask R-CNN model was trained in a supervised fashion using 84 annotated patients drawn from both cohorts. The base model was then used to predict lesions in the 106 unannotated cohort 1 patients, with its most confident predictions serving as labels. The 84 annotated patients and 106 self-annotated patients were then combined as the training set for a semi-supervised model. Finally, image matting was applied as a post-processing step to refine the boundaries of detected lesions. Ten annotated cohort 1 patients were used as the validation set, and 23 cohort 1 patients and 21 cohort 2 patients were used as the testing set. Dice similarity coefficient (DSC) and the 95th percentile Hausdorff distance (95HD) were used as evaluation metrics. We defined agreement as the degree to which a model’s predictions concurred with the ground truth annotations.
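For readers unfamiliar with the two evaluation metrics, the following is a minimal illustrative sketch of how DSC and 95HD can be computed for a pair of 2D binary masks. It is not the authors' implementation. The function names (`dice`, `boundary`, `hd95`), the `spacing` parameter, the 4-neighbor boundary definition, and the choice to take the larger of the two directed 95th-percentile distances are all assumptions made here for illustration; the abstract does not specify these details.

```python
import numpy as np

def dice(pred, truth):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    pred, truth = np.asarray(pred, bool), np.asarray(truth, bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(pred, truth).sum() / total

def boundary(mask):
    """Foreground pixels with at least one 4-neighbor in the background."""
    mask = np.asarray(mask, bool)
    m = np.pad(mask, 1)  # pad with background so image edges count as boundary
    interior = (m[1:-1, 1:-1] & m[:-2, 1:-1] & m[2:, 1:-1]
                & m[1:-1, :-2] & m[1:-1, 2:])
    return mask & ~interior

def hd95(pred, truth, spacing=(1.0, 1.0)):
    """95th percentile Hausdorff distance between mask boundaries.

    Assumed variant: the larger of the two directed 95th percentiles.
    `spacing` is the pixel size (e.g. in mm) along each axis.
    """
    p = np.argwhere(boundary(pred)) * np.asarray(spacing, float)
    t = np.argwhere(boundary(truth)) * np.asarray(spacing, float)
    if len(p) == 0 or len(t) == 0:
        return float("inf")  # assumed convention when a mask is empty
    # Pairwise Euclidean distances between the two boundary point sets.
    d = np.linalg.norm(p[:, None, :] - t[None, :, :], axis=-1)
    return float(max(np.percentile(d.min(axis=1), 95),
                     np.percentile(d.min(axis=0), 95)))

# Toy example: a 4x4 square and the same square shifted by one column.
truth = np.zeros((8, 8), bool)
truth[2:6, 2:6] = True
pred = np.zeros((8, 8), bool)
pred[2:6, 3:7] = True
print(dice(pred, truth), hd95(pred, truth))
```

In this toy case the masks overlap in 12 of 16 pixels each, so the DSC is 0.75, and every boundary pixel lies within one pixel of the other boundary, so the 95HD is 1.0 in pixel units. Passing the true pixel spacing converts the distance to millimeters.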
Results: The DSC, 95HD (mm) and agreement for the validation set on the base model were 0.659 ± 0.105, 3.75 ± 1.40, and 90.0%, respectively. For the testing set on the base model, these results were 0.564 ± 0.153, 4.52 ± 2.16 and 78.6%, respectively. When applying the semi-supervised model, the DSC, 95HD (mm) and agreement were 0.635 ± 0.142, 4.31 ± 1.34 and 100.0% for the validation set and 0.585 ± 0.146, 4.83 ± 2.53 and 76.8% for the testing set. With image matting, these values improved to 0.725 ± 0.116, 3.70 ± 1.09 and 100.0% on the validation set and 0.672 ± 0.123, 4.18 ± 1.97 and 76.8% on the testing set.
Conclusion: Semi-supervised learning offered limited improvements when applied to the Mask R-CNN backbone model. However, image matting proved to be a powerful tool in improving the segmentation of the DIL.