VM-UNet: Vision Mamba UNet for Medical Image Segmentation

Jiacheng Ruan (Shanghai Jiao Tong University), Suncheng Xiang (Shanghai Jiao Tong University), 2024

Paper Information

arXiv ID: 2402.02491
Venue: arXiv.org
Domain: medical image segmentation
Code: Available (https://github.com/JCruan519/VM-UNet)
Reproducibility: 7/10

Abstract

In the realm of medical image segmentation, both CNN-based and Transformer-based models have been extensively explored. However, CNNs exhibit limitations in long-range modeling capabilities, whereas Transformers are hampered by their quadratic computational complexity. Recently, State Space Models (SSMs), exemplified by Mamba, have emerged as a promising approach. They not only excel in modeling long-range interactions but also maintain a linear computational complexity. In this paper, leveraging state space models, we propose a U-shape architecture model for medical image segmentation, named Vision Mamba UNet (VM-UNet). Specifically, the Visual State Space (VSS) block is introduced as the foundation block to capture extensive contextual information, and an asymmetrical encoder-decoder structure is constructed. We conduct comprehensive experiments on the ISIC17, ISIC18, and Synapse datasets, and the results indicate that VM-UNet performs competitively in medical image segmentation tasks. To the best of our knowledge, this is the first medical image segmentation model constructed based on the pure SSM-based model. We aim to establish a baseline and provide valuable insights for the future development of more efficient and effective SSM-based segmentation systems. Our code is available at https://github.com/JCruan519/VM-UNet.

Summary

This paper introduces VM-UNet, a medical image segmentation model built purely on State Space Models (SSMs). It targets two known weaknesses of earlier architectures: the limited long-range modeling of CNNs and the quadratic computational complexity of Transformers. The model has three main components: an encoder built from Visual State Space (VSS) blocks for feature extraction, a decoder that restores the output, and simple skip connections between the two. Experiments on the ISIC17, ISIC18, and Synapse datasets show competitive segmentation results, which the authors offer as a baseline for future SSM-based approaches. The authors also note the model's potential applications and outline future directions for improving segmentation efficiency, including module design and compression strategies.
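The contrast between quadratic and linear complexity can be made concrete with a little arithmetic. The sketch below counts tokens for an image split into square patches, then compares the number of pairwise token interactions (the term that grows quadratically in self-attention) with the number of steps in a sequential scan (which grows linearly). The image sizes and the patch size of 4 are hypothetical values chosen for illustration and are not taken from the paper, and the counts ignore channel width and constant factors.

```python
def token_count(height, width, patch):
    """Number of non-overlapping patch tokens for a height x width image."""
    if height % patch or width % patch:
        raise ValueError("image sides must be divisible by the patch size")
    return (height // patch) * (width // patch)


def pairwise_interactions(n_tokens):
    """Token pairs scored by full self-attention: grows as n^2."""
    return n_tokens * n_tokens


def scan_steps(n_tokens):
    """Recurrent updates in a single sequential scan: grows as n."""
    return n_tokens


if __name__ == "__main__":
    # Hypothetical resolutions and patch size, for illustration only.
    for side in (256, 512):
        n = token_count(side, side, patch=4)
        print(side, n, pairwise_interactions(n), scan_steps(n))
```

Doubling the image side quadruples the token count, so the scan cost grows 4x while the pairwise term grows 16x.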

Methods

This paper employs the following methods:

State Space Models (SSMs)
Visual State Space (VSS) block
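As background for the methods listed above, here is a minimal sketch of the discretized linear state space recurrence that SSM layers build on: h_t = A_bar * h_{t-1} + B_bar * x_t and y_t = C . h_t, with a zero-order-hold discretization of a diagonal continuous system. This is a generic illustration in plain Python, not the authors' code and not the VSS block itself. It assumes a diagonal state matrix and fixed parameters, whereas Mamba makes the step size and projections input-dependent. The function names are hypothetical.

```python
import math


def discretize(a, b, delta):
    """Zero-order-hold discretization of a diagonal continuous SSM.

    a, b: per-state continuous parameters (every a[i] must be nonzero).
    delta: step size.
    """
    a_bar = [math.exp(delta * ai) for ai in a]
    b_bar = [(abar - 1.0) / ai * bi for abar, ai, bi in zip(a_bar, a, b)]
    return a_bar, b_bar


def ssm_scan(xs, a, b, c, delta):
    """Apply the recurrence to a 1-D input sequence, one step per element."""
    a_bar, b_bar = discretize(a, b, delta)
    h = [0.0] * len(a)  # hidden state starts at zero
    ys = []
    for x in xs:
        # State update: decay the old state, then inject the new input.
        h = [ab * hi + bb * x for ab, hi, bb in zip(a_bar, h, b_bar)]
        # Readout: project the state onto the output.
        ys.append(sum(ci * hi for ci, hi in zip(c, h)))
    return ys


if __name__ == "__main__":
    # Impulse response of a single decaying state (illustrative values).
    print(ssm_scan([1.0, 0.0, 0.0], a=[-1.0], b=[1.0], c=[1.0], delta=0.1))
```

The loop visits each sequence element once, which is where the linear cost in sequence length comes from. The hidden state carries information forward from all earlier elements, which is how the recurrence models long-range interactions.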

Models Used

VM-UNet
Mamba
VMamba

Datasets

The following datasets were used in this research:

ISIC17
ISIC18
Synapse

Evaluation Metrics

Mean Intersection over Union (mIoU)
Dice Similarity Coefficient (DSC)
Accuracy (Acc)
Sensitivity (Sen)
Specificity (Spe)
95% Hausdorff Distance (HD95)
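The overlap-based metrics above can all be derived from the four confusion counts of a predicted mask against a ground-truth mask. The sketch below uses the standard confusion-matrix definitions for a single binary mask pair; how the paper averages values across classes or images to obtain mIoU is not stated on this page, so that step is left out. HD95 is also omitted because it depends on boundary distances rather than pixel counts. The function names are hypothetical.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN for two equally sized flat binary masks (0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        if p and t:
            tp += 1
        elif p:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def _ratio(num, den):
    """Safe division: return 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def segmentation_metrics(pred, target):
    """IoU, DSC, Acc, Sen, and Spe for one binary mask pair."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    return {
        "iou": _ratio(tp, tp + fp + fn),
        "dsc": _ratio(2 * tp, 2 * tp + fp + fn),
        "acc": _ratio(tp + tn, tp + fp + fn + tn),
        "sen": _ratio(tp, tp + fn),
        "spe": _ratio(tn, tn + fp),
    }


if __name__ == "__main__":
    pred = [1, 1, 0, 0, 1, 0, 0, 0]
    target = [1, 0, 0, 0, 1, 1, 0, 0]
    # TP=2, FP=1, FN=1, TN=4, so iou=0.5, acc=0.75, spe=0.8
    print(segmentation_metrics(pred, target))
```

Averaging the per-class or per-image IoU values produced this way gives a mean IoU.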

Results

VM-UNet performs competitively with CNN-based and Transformer-based segmentation models on the ISIC17, ISIC18, and Synapse datasets.
It establishes a baseline for pure SSM-based segmentation models.

Limitations

The authors identified the following limitations:

Potential need for additional specialized modules to further enhance segmentation tasks.
Initial parameter count of 30M may limit real-world applications without optimization.

Technical Requirements

Number of GPUs: 1
GPU Type: NVIDIA RTX A6000

Keywords

visual state space (VSS) blocks
medical image segmentation
state space models (SSMs)
UNet architecture
deep learning

External Resources

Funding: Not specified
References: 35
Influential Citations: 13
