COCO (Common Objects in Context)

Semantic Segmentation Benchmark

Performance Over Time

Showing 9 results. Metric: mIoU.
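For readers unfamiliar with the metric, mIoU (mean intersection over union) averages the per-class ratio of correctly labeled pixels to the union of predicted and ground-truth pixels for that class. The following is a minimal sketch in plain Python, not the evaluation code used by any listed model. The function name `mean_iou`, the `ignore_index` value of 255, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; this page does not state the exact evaluation protocol.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in pred or target.

    pred, target: flat sequences of integer class labels of equal length
    (for example, all pixels of an evaluation set concatenated).
    ignore_index: ground-truth label to skip (assumed value, not from the page).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where prediction and truth agree
    pred_area = [0] * num_classes     # pixels predicted as each class
    target_area = [0] * num_classes   # pixels labeled as each class

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel, excluded from all counts
        pred_area[p] += 1
        target_area[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_area[c] + target_area[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and truth
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with 3 classes; the last pixel is unlabeled and ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU: {100 * score:.2f}")
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so the mean is about 0.722. The scores in the table below appear to be on a 0 to 100 scale, which is why the sketch multiplies by 100 when printing.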

Top Performing Models

| Rank | Model | Paper | mIoU | Date | Code |
|---|---|---|---|---|---|
| 1 | HyperSeg | HyperSeg: Towards Universal Visual Segmentation with Large Language Model | 77.20 | 2024-11-26 | congvvc/HyperSeg |
| 2 | ViT-P (OneFormer, InternImage-H) | The Missing Point in Vision Transformers for Universal Image Segmentation | 69.10 | 2025-05-26 | sajjad-sh33/vit-p |
| 3 | OneFormer (InternImage-H, emb_dim=1024, single-scale) | OneFormer: One Transformer to Rule Universal Image Segmentation | 68.80 | 2022-11-10 | huggingface/transformers, SHI-Labs/OneFormer, yangyucheng000/University, MindCode-4/code-2 |
| 4 | ViT-P (OneFormer, DiNAT-L) | The Missing Point in Vision Transformers for Universal Image Segmentation | 68.80 | 2025-05-26 | sajjad-sh33/vit-p |
| 5 | OneFormer (DiNAT-L, single-scale) | OneFormer: One Transformer to Rule Universal Image Segmentation | 68.10 | 2022-11-10 | huggingface/transformers, SHI-Labs/OneFormer, yangyucheng000/University, MindCode-4/code-2 |
| 6 | OneFormer (Swin-L, single-scale) | OneFormer: One Transformer to Rule Universal Image Segmentation | 67.40 | 2022-11-10 | huggingface/transformers, SHI-Labs/OneFormer, yangyucheng000/University, MindCode-4/code-2 |
| 7 | Mask2Former (Swin-L, single-scale) | Masked-attention Mask Transformer for Universal Image Segmentation | 67.40 | 2021-12-02 | huggingface/transformers, open-mmlab/mmdetection, facebookresearch/Mask2Former |
| 8 | MaskFormer (Swin-L, single-scale) | Masked-attention Mask Transformer for Universal Image Segmentation | 64.80 | 2021-12-02 | huggingface/transformers, open-mmlab/mmdetection, facebookresearch/Mask2Former |
| 9 | SegCLIP | SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation | 26.50 | 2022-11-27 | arrowluo/segclip |

All Papers (9)