Cityscapes

Dataset Information
Modalities: Images
Introduced: 2016
License
Homepage

Overview

Cityscapes is a large-scale dataset focused on semantic understanding of urban street scenes. It provides semantic, instance-wise, and dense pixel annotations for 30 classes grouped into 8 categories (flat surfaces, humans, vehicles, constructions, objects, nature, sky, and void). The dataset consists of around 5,000 finely annotated images and 20,000 coarsely annotated ones. Data was captured in 50 cities over several months, during daytime, and in good weather conditions. It was originally recorded as video, and the frames were manually selected to have the following features: a large number of dynamic objects, varying scene layout, and varying background.
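
The grouping of 30 classes into 8 categories can be sketched as a simple lookup table. This is an illustrative sketch, not an official API: the individual class names below are an assumption recalled from the dataset's public label definitions and should be checked against the homepage, and the names CATEGORIES, CLASS_TO_CATEGORY, category_of, and summarize are hypothetical helpers introduced here.

```python
# Category -> class names. The category names follow the overview above;
# the per-category class names are an assumption and should be verified
# against the official label definitions before use.
CATEGORIES = {
    "flat": ["road", "sidewalk", "parking", "rail track"],
    "human": ["person", "rider"],
    "vehicle": ["car", "truck", "bus", "on rails", "motorcycle",
                "bicycle", "caravan", "trailer"],
    "construction": ["building", "wall", "fence", "guard rail",
                     "bridge", "tunnel"],
    "object": ["pole", "pole group", "traffic sign", "traffic light"],
    "nature": ["vegetation", "terrain"],
    "sky": ["sky"],
    "void": ["ground", "dynamic", "static"],
}

# Inverted index: class name -> category name.
CLASS_TO_CATEGORY = {
    name: category
    for category, names in CATEGORIES.items()
    for name in names
}


def category_of(class_name: str) -> str:
    """Return the category a class belongs to (raises KeyError if unknown)."""
    return CLASS_TO_CATEGORY[class_name]


def summarize() -> tuple[int, int]:
    """Return (number of categories, number of classes)."""
    return len(CATEGORIES), len(CLASS_TO_CATEGORY)


if __name__ == "__main__":
    n_categories, n_classes = summarize()
    print(f"{n_classes} classes in {n_categories} categories")
    print("rider ->", category_of("rider"))
```

A lookup like this is useful when results are reported at both class level and category level, since a per-class prediction can be collapsed to its category without relabeling the data.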

Source: A Review on Deep Learning Techniques Applied to Semantic Segmentation
Image Source: https://www.cityscapes-dataset.com/dataset-overview/

Variants: Semi-Supervised Semantic Segmentation on Cityscapes 6.25% labeled, Semi-Supervised Semantic Segmentation on Cityscapes 12.5% labeled, Cityscapes with extra (no coarse labels), Cityscapes with extra (no coarse), Cityscapes heterogeneous, Cityscapes 6.25% labeled, Cityscapes 5% labeled, Cityscapes 2% labeled, Cityscapes 128x128, Cityscapes 93 labeled, Cityscapes 10% labeled, Cityscapes, Cityscapes-5K 256x512, Cityscapes-25K 256x512, Cityscapes val, Cityscapes test, Cityscapes Photo-to-Labels, Cityscapes Labels-to-Photo, Cityscapes 50% labeled, Cityscapes 25% labeled, Cityscapes 12.5% labeled, Cityscapes 100 samples labeled
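
Several of the variants above name a labeled fraction (for example 12.5% or 6.25%). As a rough sketch of what such a split means, the snippet below draws a fixed-seed random subset of image IDs to treat as labeled. It assumes the finely annotated training split holds 2,975 images, a figure not stated on this page, and uses made-up IDs; published semi-supervised protocols typically ship fixed file lists, so this is an illustration rather than any benchmark's actual partition. The names FINE_TRAIN_IMAGES and labeled_subset are hypothetical.

```python
import random

# Assumption: size of the finely annotated training split.
FINE_TRAIN_IMAGES = 2975


def labeled_subset(image_ids, fraction, seed=0):
    """Pick a reproducible random subset of IDs to treat as labeled.

    fraction is the labeled share in [0, 1], e.g. 0.125 for "12.5% labeled".
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, round(len(image_ids) * fraction))
    rng = random.Random(seed)  # local RNG so global state is untouched
    return sorted(rng.sample(image_ids, k))


if __name__ == "__main__":
    # Placeholder IDs; real code would list the training image files instead.
    ids = [f"img_{i:04d}" for i in range(FINE_TRAIN_IMAGES)]
    for fraction in (0.5, 0.25, 0.125, 0.0625):
        subset = labeled_subset(ids, fraction)
        print(f"{fraction:.2%} labeled -> {len(subset)} images")
```

Fixing the seed keeps the labeled subset identical across runs, which matters when comparing methods at the same labeled fraction.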

Associated Benchmarks

This dataset is used in 8 benchmarks.

Recent Benchmark Submissions

Task | Model | Paper | Date
Instance Segmentation | CAST | CAST: Contrastive Adaptation and Distillation … | 2025-05-28
Knowledge Distillation | CAST | CAST: Contrastive Adaptation and Distillation … | 2025-05-28
Semi-Supervised Instance Segmentation | CAST | CAST: Contrastive Adaptation and Distillation … | 2025-05-28
Unsupervised Panoptic Segmentation | CUPS (40 pseudo-classes) | Scene-Centric Unsupervised Panoptic Segmentation | 2025-04-02
Unsupervised Panoptic Segmentation | CUPS (54 pseudo-classes) | Scene-Centric Unsupervised Panoptic Segmentation | 2025-04-02
Unsupervised Panoptic Segmentation | CUPS (27 pseudo-classes) | Scene-Centric Unsupervised Panoptic Segmentation | 2025-04-02
Open Vocabulary Semantic Segmentation | TTD (MaskCLIP) | TTD: Text-Tag Self-Distillation Enhancing Image-Text … | 2024-03-30
Open Vocabulary Semantic Segmentation | TTD (TCL) | TTD: Text-Tag Self-Distillation Enhancing Image-Text … | 2024-03-30
Unsupervised Panoptic Segmentation | U2Seg (827 pseudo-classes) | Unsupervised Universal Image Segmentation | 2023-12-28
Unsupervised Panoptic Segmentation | DepthG + CutLER | Unsupervised Semantic Segmentation Through Depth-Guided … | 2023-09-21
Semantic Segmentation | DiffSeg (512) | Diffuse, Attend, and Segment: Unsupervised … | 2023-08-23
Open Vocabulary Semantic Segmentation | FC-CLIP | Convolutions Die Hard: Open-Vocabulary Segmentation … | 2023-08-04
Open Vocabulary Semantic Segmentation | CLIP Surgery (CLIP without any fine-tuning) | A Closer Look at the … | 2023-04-12
Video Prediction | DMVFN | A Dynamic Multi-Scale Voxel Flow … | 2023-03-17
Semantic Segmentation | SPFNet34M | S$^2$-FPN: Scale-ware Strip Attention Guided … | 2022-06-15
Open Vocabulary Semantic Segmentation | SimSeg | A Simple Baseline for Open-Vocabulary … | 2021-12-29
Image Generation | Projected GAN | Projected GANs Converge Faster | 2021-11-01
Image Generation | SAGAN | Generative Adversarial Transformers | 2021-03-01
Image Generation | VQGAN | Generative Adversarial Transformers | 2021-03-01
Image Generation | GAN | Generative Adversarial Transformers | 2021-03-01

Research Papers

Recent papers with results on this dataset: