Enhancing CT image segmentation accuracy through ensemble loss function optimization
Recommended Citation
Li C, Sultan R, Bagher-Ebadian H, Qiang Y, Thind K, Zhu D, and Chetty IJ. Enhancing CT image segmentation accuracy through ensemble loss function optimization. Med Phys 2025;52(7):e17848.
Document Type
Article
Publication Date
7-1-2025
Publication Title
Medical Physics
Abstract
BACKGROUND: In CT-based medical image segmentation, the choice of loss function profoundly impacts the training efficacy of deep neural networks. Traditional loss functions like cross entropy (CE), Dice, Boundary, and TopK each have unique strengths and limitations, often introducing biases when used individually.
PURPOSE: This study aims to enhance segmentation accuracy by optimizing ensemble loss functions, thereby addressing the biases and limitations of single loss functions and their linear combinations.
METHODS: We implemented a comprehensive evaluation of loss function combinations by integrating CE, Dice, Boundary, and TopK loss functions through both loss-level linear combination and model-level ensemble methods. Our approach utilized two state-of-the-art 3D segmentation architectures, Attention U-Net (AttUNet) and SwinUNETR, to test the impact of these methods. The study was conducted on two large CT dataset cohorts: an institutional dataset containing pelvic organ segmentations, and a public dataset consisting of multiple organ segmentations. All the models were trained from scratch with different loss settings, and performance was evaluated using Dice similarity coefficient (DSC), Hausdorff distance (HD), and average surface distance (ASD). In the ensemble approach, both static averaging and learnable dynamic weighting strategies were employed to combine the outputs of models trained with different loss functions.
RESULTS: Extensive experiments revealed the following: (1) the linear combination of loss functions achieved results comparable to those of single loss-driven methods; (2) compared to the best non-ensemble methods, ensemble-based approaches resulted in a 2%-7% increase in DSC scores, along with notable reductions in HD (e.g., a 19.1% reduction for rectum segmentation using SwinUNETR) and ASD (e.g., a 49.0% reduction for prostate segmentation using AttUNet); (3) the learnable ensemble approach with optimized weights produced finer details in predicted masks, as confirmed by qualitative analyses; and (4) the learnable ensemble outperformed the static ensemble across most metrics (DSC, HD, ASD) for both AttUNet and SwinUNETR architectures.
CONCLUSIONS: Our findings support the efficacy of using ensemble models with optimized weights to improve segmentation accuracy, highlighting the potential for broader applications in automated medical image analysis.
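The abstract contrasts a loss-level linear combination with a model-level ensemble that uses either static averaging or learnable weights. The following is a minimal, dependency-free sketch of those three ideas on a toy binary mask; it is not the authors' implementation. All function names, the toy probabilities, the softmax parameterisation of the weights, the binary cross-entropy objective, and the finite-difference optimiser are assumptions chosen for illustration, since the abstract does not specify them.

```python
import math

def softmax(zs):
    """Numerically stable softmax over a list of floats."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def combined_loss(losses, weights):
    """Loss-level linear combination: sum_i w_i * L_i (weights are assumed)."""
    return sum(w * l for w, l in zip(weights, losses))

def ensemble(prob_maps, weights=None):
    """Model-level ensemble of per-voxel foreground probabilities.

    prob_maps: one flat list of probabilities per model.
    weights:   None for static averaging, otherwise one weight per model.
    """
    n = len(prob_maps)
    if weights is None:
        weights = [1.0 / n] * n
    return [sum(w * p[i] for w, p in zip(weights, prob_maps))
            for i in range(len(prob_maps[0]))]

def bce(probs, target, eps=1e-7):
    """Mean binary cross entropy between probabilities and a 0/1 mask."""
    total = 0.0
    for p, t in zip(probs, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(target)

def fit_weights(prob_maps, target, steps=200, lr=0.5, h=1e-5):
    """Learn ensemble weights by gradient descent on softmax logits.

    Gradients are estimated by forward finite differences so that the
    sketch needs no autodiff library (an illustrative simplification).
    """
    logits = [0.0] * len(prob_maps)
    for _ in range(steps):
        base = bce(ensemble(prob_maps, softmax(logits)), target)
        grads = []
        for k in range(len(logits)):
            bumped = logits[:]
            bumped[k] += h
            loss_k = bce(ensemble(prob_maps, softmax(bumped)), target)
            grads.append((loss_k - base) / h)
        logits = [z - lr * g for z, g in zip(logits, grads)]
    return softmax(logits)

def dice(probs, target, threshold=0.5):
    """Dice similarity coefficient after thresholding the probabilities."""
    pred = [1 if p >= threshold else 0 for p in probs]
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 2.0 * inter / denom if denom else 1.0

# Toy data (hypothetical): six voxels, two models trained with different losses.
target = [1, 1, 0, 0, 1, 0]
model_a = [0.9, 0.8, 0.2, 0.1, 0.7, 0.3]   # closer to the reference mask
model_b = [0.6, 0.4, 0.5, 0.6, 0.2, 0.4]   # less reliable

static_probs = ensemble([model_a, model_b])
learned_w = fit_weights([model_a, model_b], target)
learned_probs = ensemble([model_a, model_b], learned_w)

print("static DSC: ", dice(static_probs, target))
print("learned w:  ", learned_w)
print("learned DSC:", dice(learned_probs, target))
```

In this toy setting the learned weights shift toward the more reliable model, which is the intuition behind a learnable ensemble; in practice the weights would be fitted on held-out data rather than on the voxels used for evaluation.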
Medical Subject Headings
Tomography, X-Ray Computed; Image Processing, Computer-Assisted; Humans
PubMed ID
40275531
ePublication
ePub ahead of print
Volume
52
Issue
7
First Page
e17848
