
Semi-Supervised Multi-Modal Medical Image Segmentation for Complex Situations

(2025)

Paper Information
arXiv ID: 2506.17136

Abstract

Semi-supervised learning effectively addresses the issue of limited annotations in medical images, but its performance is often inadequate for complex backgrounds and challenging tasks. Multi-modal fusion methods can significantly improve the accuracy of medical image segmentation by providing complementary information. However, they struggle to achieve significant improvements under semi-supervised conditions because of the difficulty of effectively leveraging unlabeled data. There is therefore a clear need for an effective and reliable multi-modal learning strategy that exploits unlabeled data in semi-supervised segmentation. To address these issues, we propose a novel semi-supervised multi-modal medical image segmentation approach that leverages complementary multi-modal information to enhance performance with limited labeled data. Our approach employs a multi-stage multi-modal fusion and enhancement strategy to fully utilize complementary multi-modal information while reducing feature discrepancies and enhancing feature sharing and alignment. Furthermore, we introduce contrastive mutual learning to constrain prediction consistency across modalities, thereby improving the robustness of segmentation results in semi-supervised tasks. Experimental results on two multi-modal datasets demonstrate the superior performance and robustness of the proposed framework, establishing its potential for medical image segmentation in complex scenarios.
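
The fusion strategy itself is not spelled out in this summary. As a rough illustration, a single stage of such a scheme might fuse the two modality branches and feed the fused features back into each branch to reduce cross-modal feature discrepancy; the PyTorch sketch below follows that assumption, and all module and parameter names (`FusionStage`, `enhance_a`, ...) are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn

class FusionStage(nn.Module):
    """One hypothetical fusion-and-enhancement stage for two modalities.

    The fused features are injected back into each branch, which is one
    plausible way to "reduce feature discrepancies and enhance feature
    sharing" as the abstract describes. Not the authors' implementation.
    """
    def __init__(self, channels):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv3d(2 * channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm3d(channels),
            nn.ReLU(inplace=True),
        )
        # 1x1x1 convs mix each branch with the shared fused features
        self.enhance_a = nn.Conv3d(2 * channels, channels, kernel_size=1)
        self.enhance_b = nn.Conv3d(2 * channels, channels, kernel_size=1)

    def forward(self, feat_a, feat_b):
        fused = self.fuse(torch.cat([feat_a, feat_b], dim=1))
        feat_a = self.enhance_a(torch.cat([feat_a, fused], dim=1))
        feat_b = self.enhance_b(torch.cat([feat_b, fused], dim=1))
        return feat_a, feat_b, fused
```

Stacking one such stage at several encoder depths would give the "multi-stage" behavior the abstract refers to.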

Summary

This paper presents a novel semi-supervised multi-modal medical image segmentation method designed to address the scarcity of labeled medical image data. The proposed approach uses a multi-stage multi-modal fusion and enhancement strategy to combine information from different modalities and improve segmentation accuracy despite limited annotations. In addition, contrastive mutual learning enforces consistency across modalities and makes the segmentation results more robust. Experiments on two datasets, BraTS 2019 and a private nasopharyngeal carcinoma (NPC) dataset, show superior performance over existing state-of-the-art methods on complex medical image segmentation tasks.
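
The exact form of the contrastive mutual learning loss is likewise not given here. A common realization, sketched below under that assumption, pairs a symmetric KL term that ties the two modality branches' predictions together with an InfoNCE term over per-scan embeddings; the function names and the temperature value are illustrative.

```python
import torch
import torch.nn.functional as F

def mutual_consistency_loss(logits_a, logits_b):
    # Symmetric KL between the two branches' soft predictions, so each
    # modality branch is constrained by the other on unlabeled data.
    p_a = F.softmax(logits_a, dim=1)
    p_b = F.softmax(logits_b, dim=1)
    kl_ab = F.kl_div(F.log_softmax(logits_a, dim=1), p_b, reduction="batchmean")
    kl_ba = F.kl_div(F.log_softmax(logits_b, dim=1), p_a, reduction="batchmean")
    return 0.5 * (kl_ab + kl_ba)

def info_nce(z_a, z_b, temperature=0.1):
    # InfoNCE over per-scan embeddings: the two modalities' views of the
    # same scan are positives; all other pairs in the batch are negatives.
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature          # (B, B) similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, targets)
```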

Methods

This paper employs the following methods (see the combined-objective sketch after this list):

  • Semi-supervised learning
  • Multi-modal fusion
  • Contrastive mutual learning

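A minimal sketch of how the three methods above might combine into one training objective, reusing `mutual_consistency_loss` and `info_nce` from the earlier sketch; the loss weights and the assumed model interface are hypothetical.

```python
import torch.nn.functional as F

def training_step(model, labeled_batch, unlabeled_batch,
                  lambda_cons=1.0, lambda_ctr=0.1):
    # `model` is assumed to map a two-modality input to per-modality
    # logits and embeddings: (logits_a, logits_b, z_a, z_b).
    x_l, y_l = labeled_batch
    logits_a, logits_b, _, _ = model(x_l)
    # Supervised term on the few labeled scans.
    sup = F.cross_entropy(logits_a, y_l) + F.cross_entropy(logits_b, y_l)

    # Unsupervised terms: cross-modal agreement on unlabeled scans.
    u_logits_a, u_logits_b, z_a, z_b = model(unlabeled_batch)
    cons = mutual_consistency_loss(u_logits_a, u_logits_b)
    ctr = info_nce(z_a, z_b)
    return sup + lambda_cons * cons + lambda_ctr * ctr
```
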
Models Used

  • None specified

Datasets

The following datasets were used in this research:

  • BraTS 2019
  • NPC Dataset

Evaluation Metrics

  • Dice Coefficient (DSC)
  • Average Surface Distance (ASD)

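Both metrics follow their standard definitions: DSC = 2|A∩B| / (|A| + |B|), and ASD averages the distances between the two segmentation surfaces. The NumPy/SciPy sketch below is a generic implementation of those definitions, not the paper's evaluation code.

```python
import numpy as np
from scipy import ndimage

def dice_coefficient(pred, target, eps=1e-6):
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def _surface(mask):
    # Boundary voxels: the mask minus its binary erosion.
    return mask & ~ndimage.binary_erosion(mask)

def average_surface_distance(pred, target):
    sp = _surface(pred.astype(bool))
    st = _surface(target.astype(bool))
    if not sp.any() or not st.any():
        return 0.0  # degenerate case: an empty surface
    # Distance from each surface voxel of one mask to the other surface.
    d_pt = ndimage.distance_transform_edt(~st)[sp]
    d_tp = ndimage.distance_transform_edt(~sp)[st]
    return (d_pt.sum() + d_tp.sum()) / (d_pt.size + d_tp.size)
```
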
Results

  • Achieved high accuracy for complex segmentation targets with limited labeled data
  • Outperformed state-of-the-art methods in tumor segmentation tasks

Limitations

The authors identified the following limitations:

  • Limited availability of large, high-quality labeled training datasets

Technical Requirements

  • Number of GPUs: 1
  • GPU Type: NVIDIA A6000
  • Compute Requirements: batch size of 4; maximum of 60k training iterations
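
For reproduction, the reported setup maps onto a small configuration like the one below; the field names are illustrative, and only the values come from this summary.

```python
# Hypothetical training configuration; field names are not from the paper.
train_config = dict(
    num_gpus=1,
    gpu_type="NVIDIA A6000",
    batch_size=4,
    max_iterations=60_000,
)
```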
