
Co-Seg++: Mutual Prompt-Guided Collaborative Learning for Versatile Medical Segmentation

(2025)

Paper Information

arXiv ID

2506.17159

Contents

Abstract
Methods
Datasets
Results
Related Work
External Resources

Abstract

Medical image analysis is critical yet challenged by the need to jointly segment organs or tissues, and numerous instances for anatomical structures and tumor microenvironment analysis. Existing studies typically formulated different segmentation tasks in isolation, which overlooks the fundamental interdependencies between these tasks, leading to suboptimal segmentation performance and insufficient medical image understanding. To address this issue, we propose a Co-Seg++ framework for versatile medical segmentation. Specifically, we introduce a novel co-segmentation paradigm, allowing semantic and instance segmentation tasks to mutually enhance each other. We first devise a spatio-temporal prompt encoder (STP-Encoder) to capture long-range spatial and temporal relationships between segmentation regions and image embeddings as prior spatial constraints. Moreover, we devise a multi-task collaborative decoder (MTC-Decoder) that leverages cross-guidance to strengthen the contextual consistency of both tasks, jointly computing semantic and instance segmentation masks. Extensive experiments on diverse CT and histopathology datasets demonstrate that the proposed Co-Seg++ outperforms state-of-the-art methods in the semantic, instance, and panoptic segmentation of dental anatomical structures, histopathology tissues, and nuclei instances. The source code is available at https://github.com/xq141839/Co-Seg-Plus.

Summary

The paper presents Co-Seg++, a framework designed for versatile medical segmentation by employing mutual prompt-guided collaborative learning. It addresses challenges in medical image analysis, specifically the interdependence between semantic and instance segmentation tasks. The proposed method utilizes a spatio-temporal prompt encoder (STP-Encoder) to integrate spatial and temporal relationships, and a multi-task collaborative decoder (MTC-Decoder) to enhance contextual consistency between tasks. The framework is validated through extensive experiments on various datasets, revealing substantial improvements over state-of-the-art methods in semantic, instance, and panoptic segmentation tasks across medical imaging scenarios, including histopathology and dental CBCT datasets.

Methods

This paper employs the following methods:

Spatio-temporal prompt encoder (STP-Encoder)
Multi-task collaborative decoder (MTC-Decoder)
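
The abstract describes the STP-Encoder as relating segmentation regions to image embeddings. As a rough illustration of that general idea only, the sketch below lets a few region prompt tokens attend over patch embeddings with plain scaled dot-product cross-attention. This is an assumption for illustration, not the paper's architecture: the function names, token counts, and embedding size are all hypothetical, and the real encoder and decoder are defined in the authors' repository.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, keys_values):
    """Scaled dot-product cross-attention (generic, not Co-Seg++ specific).

    queries:     (num_prompts, dim) region prompt tokens (hypothetical)
    keys_values: (num_patches, dim) image patch embeddings (hypothetical)
    Returns the attended prompt tokens and the attention weights.
    """
    dim = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(dim)
    weights = softmax(scores, axis=-1)
    return weights @ keys_values, weights


rng = np.random.default_rng(0)
image_emb = rng.normal(size=(64, 32))      # 64 patches, 32-dim (assumed sizes)
region_prompts = rng.normal(size=(4, 32))  # 4 prompt tokens (assumed)

attended, weights = cross_attention(region_prompts, image_emb)
```

Each row of `weights` is a distribution over image patches, so every prompt token can draw on any patch regardless of distance, which is one common way to model long-range spatial relationships.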

Models Used

Co-Seg++

Datasets

The following datasets were used in this research:

PUMA
GlaS
CRAG
ToothFairy2

Evaluation Metrics

Dice coefficient
mean intersection over union (mIoU)
Hausdorff distance (HD)
object-level F1 score
aggregated Jaccard index (AJI)
panoptic quality (PQ)
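
For readers unfamiliar with these metrics, the sketch below shows how three of them are commonly computed: Dice and IoU on binary masks, and PQ on single-class instance label maps. It follows the standard definitions (PQ matches instances at IoU > 0.5) and is not taken from the paper's evaluation code. The function names, and the choice to return 1.0 when both masks are empty, are assumptions.

```python
import numpy as np


def dice(pred, gt):
    # Dice = 2|A ∩ B| / (|A| + |B|) on binary masks.
    pred, gt = np.asarray(pred, dtype=bool), np.asarray(gt, dtype=bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(pred, gt).sum() / denom


def iou(pred, gt):
    # IoU = |A ∩ B| / |A ∪ B| on binary masks.
    pred, gt = np.asarray(pred, dtype=bool), np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # assumed convention, as above
    return np.logical_and(pred, gt).sum() / union


def panoptic_quality(pred_labels, gt_labels, thresh=0.5):
    """PQ for one class; label maps use 0 for background.

    PQ = (sum of IoU over matched pairs) / (TP + 0.5 * FP + 0.5 * FN).
    With thresh >= 0.5 each instance has at most one match, so a
    simple greedy scan is sufficient.
    """
    pred_labels, gt_labels = np.asarray(pred_labels), np.asarray(gt_labels)
    pred_ids = [i for i in np.unique(pred_labels) if i != 0]
    gt_ids = [i for i in np.unique(gt_labels) if i != 0]
    matched, iou_sum = set(), 0.0
    for g in gt_ids:
        g_mask = gt_labels == g
        for p in np.unique(pred_labels[g_mask]):
            if p == 0 or p in matched:
                continue
            value = iou(pred_labels == p, g_mask)
            if value > thresh:
                matched.add(p)
                iou_sum += value
                break
    tp = len(matched)
    fp, fn = len(pred_ids) - tp, len(gt_ids) - tp
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom > 0 else 0.0


pred_mask = np.array([[1, 1, 0, 0]])
gt_mask = np.array([[1, 0, 0, 0]])
dice_score = dice(pred_mask, gt_mask)
iou_score = iou(pred_mask, gt_mask)

gt_inst = np.array([[1, 1, 0, 2, 2]])
pred_inst = np.array([[1, 1, 0, 0, 3]])
pq = panoptic_quality(pred_inst, gt_inst)
```

In the instance example, one predicted nucleus matches exactly while the other overlaps only half of its target, so it counts as both a false positive and a false negative.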

Results

Co-Seg++ outperforms state-of-the-art methods in segmentation tasks
Significant improvements in Dice scores across various datasets
State-of-the-art performance achieved in both semantic and instance segmentation tasks

Technical Requirements

Number of GPUs: 1
GPU Type: NVIDIA A5000
Compute Requirements: batch size of 16, trained for 300 epochs
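
The reported settings can be collected into a small configuration object, sketched below. The class name, field names, and the `total_steps` helper are hypothetical, and the sample count passed to the helper is a made-up number for illustration; only the four default values come from this page.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values reported on this page; the field names are assumptions.
    num_gpus: int = 1
    gpu_type: str = "NVIDIA A5000"
    batch_size: int = 16
    epochs: int = 300

    def total_steps(self, num_train_samples: int) -> int:
        # Optimizer steps if every epoch sees all samples once,
        # with the last partial batch kept.
        return math.ceil(num_train_samples / self.batch_size) * self.epochs


cfg = TrainConfig()
steps = cfg.total_steps(1000)  # 1000 samples is a hypothetical dataset size
```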

External Resources

References: 46
