Medical image analysis is critical yet challenged by the need to jointly segment organs or tissues and numerous instances for anatomical structure and tumor microenvironment analysis. Existing studies typically formulate different segmentation tasks in isolation, which overlooks the fundamental interdependencies between these tasks and leads to suboptimal segmentation performance and insufficient medical image understanding. To address this issue, we propose the Co-Seg++ framework for versatile medical segmentation. Specifically, we introduce a novel co-segmentation paradigm that allows semantic and instance segmentation tasks to mutually enhance each other. We first devise a spatio-temporal prompt encoder (STP-Encoder) to capture long-range spatial and temporal relationships between segmentation regions and image embeddings as prior spatial constraints. Moreover, we devise a multi-task collaborative decoder (MTC-Decoder) that leverages cross-guidance to strengthen the contextual consistency of both tasks, jointly computing semantic and instance segmentation masks. Extensive experiments on diverse CT and histopathology datasets demonstrate that the proposed Co-Seg++ outperforms state-of-the-art methods in the semantic, instance, and panoptic segmentation of dental anatomical structures, histopathology tissues, and nuclei instances. The source code is available at https://github.com/xq141839/Co-Seg-Plus.
The paper presents Co-Seg++, a framework designed for versatile medical segmentation by employing mutual prompt-guided collaborative learning. It addresses challenges in medical image analysis, specifically the interdependence between semantic and instance segmentation tasks. The proposed method utilizes a spatio-temporal prompt encoder (STP-Encoder) to integrate spatial and temporal relationships, and a multi-task collaborative decoder (MTC-Decoder) to enhance contextual consistency between tasks. The framework is validated through extensive experiments on various datasets, revealing substantial improvements over state-of-the-art methods in semantic, instance, and panoptic segmentation tasks across medical imaging scenarios, including histopathology and dental CBCT datasets.
This paper employs the following methods:
- Spatio-temporal prompt encoder (STP-Encoder)
- Multi-task collaborative decoder (MTC-Decoder)
The following datasets were used in this research:
- PUMA
- GlaS
- CRAG
- ToothFairy2
The following evaluation metrics were used:
- Dice coefficient
- mean intersection over union (mIoU)
- Hausdorff distance (HD)
- object-level F1 score
- aggregated Jaccard index (AJI)
- panoptic quality (PQ)
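As an illustration of how three of the listed metrics are commonly defined, the sketch below computes the Dice coefficient, IoU, and panoptic quality in plain Python. It is a minimal sketch and not the authors' evaluation code: the function names, the nested-list mask representation, the pixel-set instance representation, and the choice to score two empty masks as 1.0 are all assumptions made for this example. mIoU would be the mean of `iou` over classes.

```python
from typing import List, Sequence, Set, Tuple

Mask = Sequence[Sequence[int]]  # binary mask as rows of 0/1 (assumed representation)
Pixel = Tuple[int, int]         # (row, col) coordinate of one pixel


def _counts(pred: Mask, target: Mask) -> Tuple[int, int, int]:
    """Return (intersection, predicted foreground, target foreground) pixel counts."""
    inter = pred_fg = tgt_fg = 0
    for pred_row, tgt_row in zip(pred, target):
        for p, t in zip(pred_row, tgt_row):
            inter += 1 if (p and t) else 0
            pred_fg += 1 if p else 0
            tgt_fg += 1 if t else 0
    return inter, pred_fg, tgt_fg


def dice(pred: Mask, target: Mask) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|)."""
    inter, pred_fg, tgt_fg = _counts(pred, target)
    if pred_fg + tgt_fg == 0:
        return 1.0  # assumption: two empty masks count as a perfect match
    return 2.0 * inter / (pred_fg + tgt_fg)


def iou(pred: Mask, target: Mask) -> float:
    """IoU = |A ∩ B| / |A ∪ B|."""
    inter, pred_fg, tgt_fg = _counts(pred, target)
    union = pred_fg + tgt_fg - inter
    if union == 0:
        return 1.0  # assumption: two empty masks count as a perfect match
    return inter / union


def panoptic_quality(pred_instances: List[Set[Pixel]],
                     gt_instances: List[Set[Pixel]]) -> float:
    """PQ = (sum of IoU over matched pairs) / (TP + 0.5*FP + 0.5*FN).

    A predicted and a ground-truth instance are matched when their IoU
    exceeds 0.5; for non-overlapping instances this matching is unique.
    """
    matched_iou = 0.0
    matched_pred = set()
    for gt in gt_instances:
        for i, pred in enumerate(pred_instances):
            if i in matched_pred:
                continue
            union = len(gt | pred)
            score = len(gt & pred) / union if union else 0.0
            if score > 0.5:
                matched_iou += score
                matched_pred.add(i)
                break
    tp = len(matched_pred)
    fp = len(pred_instances) - tp
    fn = len(gt_instances) - tp
    denom = tp + 0.5 * fp + 0.5 * fn
    return matched_iou / denom if denom else 0.0
```

For example, a prediction `[[1, 1, 0], [0, 1, 0]]` against a target `[[1, 0, 0], [0, 1, 1]]` shares two foreground pixels out of three in each mask, so `dice` returns 4/6 and `iou` returns 2/4.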
The paper reports the following results:
- Co-Seg++ outperforms state-of-the-art methods in segmentation tasks
- Significant improvements in Dice scores across various datasets
- State-of-the-art performance achieved in both semantic and instance segmentation tasks
The following compute resources and training settings were used:
- Number of GPUs: 1
- GPU Type: NVIDIA A5000
- Compute Requirements: batch size of 16, trained for 300 epochs