
MambaIR: A Simple Baseline for Image Restoration with State-Space Model

Hang Guo (Tsinghua Shenzhen International Graduate School, Tsinghua University), Jinmin Li (Tsinghua Shenzhen International Graduate School, Tsinghua University), Tao Dai (College of Computer Science and Software Engineering, Shenzhen University), Zhihao Ouyang (ByteDance Inc), Xudong Ren (Tsinghua Shenzhen International Graduate School, Tsinghua University), Shu-Tao Xia (Tsinghua Shenzhen International Graduate School, Tsinghua University; Peng Cheng Laboratory) (2024)

Paper Information
arXiv ID: 2402.15648
Venue: European Conference on Computer Vision
Domain: Computer Vision
SOTA Claim: Yes
Code: https://github.com/csguoh/MambaIR
Reproducibility: 7/10

Abstract

Recent years have seen significant advancements in image restoration, largely attributed to the development of modern deep neural networks, such as CNNs and Transformers. However, existing restoration backbones often face the dilemma between global receptive fields and efficient computation, hindering their application in practice. Recently, the Selective Structured State Space Model, especially the improved version Mamba, has shown great potential for long-range dependency modeling with linear complexity, which offers a way to resolve the above dilemma. However, the standard Mamba still faces certain challenges in low-level vision such as local pixel forgetting and channel redundancy. In this work, we introduce a simple but effective baseline, named MambaIR, which introduces both local enhancement and channel attention to improve the vanilla Mamba. In this way, our MambaIR takes advantage of the local pixel similarity and reduces the channel redundancy. Extensive experiments demonstrate the superiority of our method, for example, MambaIR outperforms SwinIR by up to 0.45 dB on image SR, using similar computational cost but with a global receptive field. Code is available at https://github.com/csguoh/MambaIR.

Summary

This paper presents MambaIR, a new baseline for image restoration built on the Selective Structured State Space Model (Mamba). The standard Mamba model faces challenges in low-level vision tasks, namely local pixel forgetting and channel redundancy. MambaIR addresses these limitations by incorporating local enhancement and channel attention. Extensive experiments demonstrate that MambaIR outperforms existing models such as SwinIR on image super-resolution while maintaining similar computational cost. The approach shows promising scalability and robustness across various restoration tasks, including denoising and JPEG compression artifact reduction.
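
For context, Mamba builds on the linear state-space formulation; the recap below uses the standard S4/Mamba notation as a general reminder and is not reproduced from the paper's own equations.

$$
h'(t) = A\,h(t) + B\,x(t), \qquad y(t) = C\,h(t)
$$

Discretizing with step size $\Delta$ (zero-order hold) gives the recurrence computed in practice,

$$
h_t = \bar{A}\,h_{t-1} + \bar{B}\,x_t, \qquad y_t = C\,h_t, \qquad
\bar{A} = e^{\Delta A}, \quad \bar{B} = (\Delta A)^{-1}\big(e^{\Delta A} - I\big)\,\Delta B,
$$

and Mamba's "selective" mechanism makes $B$, $C$, and $\Delta$ input-dependent, enabling content-aware long-range modeling at linear cost.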

Methods

This paper employs the following methods; a minimal structural sketch of the core block follows the list:

  • Mamba
  • Selective Structured State Space Model
  • Residual State Space Blocks (RSSBs)
  • Vision State-Space Module (VSSM)
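
As a rough illustration of how the pieces above fit together, the following PyTorch sketch wraps a vision state-space module with a convolutional local-enhancement branch and channel attention inside a residual block. The `PlaceholderVSSM`, layer names, and exact ordering are assumptions for illustration only, not the authors' implementation (the real VSSM performs a 2D selective scan).

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(self.pool(x))


class PlaceholderVSSM(nn.Module):
    """Stand-in for the Vision State-Space Module (the selective 2D scan is omitted)."""
    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        return self.proj(x)


class RSSB(nn.Module):
    """Residual State-Space Block: global SSM mixing + local enhancement + channel attention."""
    def __init__(self, channels: int):
        super().__init__()
        self.norm1 = nn.GroupNorm(1, channels)  # LayerNorm-like normalization over channels
        self.vssm = PlaceholderVSSM(channels)   # global, linear-complexity token mixing
        self.norm2 = nn.GroupNorm(1, channels)
        self.local = nn.Sequential(             # local enhancement to counter pixel forgetting
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.GELU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.ca = ChannelAttention(channels)    # intended to reduce channel redundancy

    def forward(self, x):
        x = x + self.vssm(self.norm1(x))
        x = x + self.ca(self.local(self.norm2(x)))
        return x


# Shape check: an RSSB keeps the (B, C, H, W) layout.
# RSSB(64)(torch.randn(1, 64, 48, 48)).shape -> torch.Size([1, 64, 48, 48])
```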

Models Used

  • Mamba
  • SwinIR

Datasets

The following datasets were used in this research:

  • DIV2K
  • Flickr2K
  • Set5
  • Set14
  • B100
  • Urban100
  • Manga109
  • BSD500
  • WED
  • BSD68
  • Kodak24
  • McMaster
  • SIDD
  • DND

Evaluation Metrics

  • PSNR
  • SSIM
  • L1 loss (training objective)
  • Charbonnier loss (training objective; a minimal sketch of PSNR and this loss follows the list)
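
The sketch below shows PSNR as typically computed for restoration benchmarks and the Charbonnier loss in its common form; the `eps` value and the data range are generic defaults, not values taken from the paper.

```python
import torch


def psnr(pred: torch.Tensor, target: torch.Tensor, data_range: float = 1.0) -> torch.Tensor:
    """Peak signal-to-noise ratio in dB for images scaled to [0, data_range]."""
    mse = torch.mean((pred - target) ** 2)
    return 10.0 * torch.log10(data_range ** 2 / mse)


def charbonnier_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-3) -> torch.Tensor:
    """Smooth L1-like penalty sqrt((x - y)^2 + eps^2), averaged over all elements."""
    return torch.sqrt((pred - target) ** 2 + eps ** 2).mean()
```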

Results

  • MambaIR outperforms SwinIR by up to 0.45 dB on image SR
  • Achieves superior performance on image denoising and JPEG compression artifact reduction

Limitations

The authors identified the following limitations of the standard Mamba when applied to low-level vision, which MambaIR is designed to address:

  • Local pixel forgetting
  • Channel redundancy in the standard Mamba model

Technical Requirements

  • Number of GPUs: 8
  • GPU Type: NVIDIA V100

Keywords

Image Restoration, State-Space Model, Global Receptive Field, Long-range Dependency Modeling
