Hypersim

Dataset Information
Modalities
Images, point clouds, 3D meshes, RGB-D

Overview

For many fundamental scene understanding tasks, it is difficult or impossible to obtain per-pixel ground truth labels from real images. Hypersim is a photorealistic synthetic dataset for holistic indoor scene understanding. It contains 77,400 images of 461 indoor scenes with detailed per-pixel labels and corresponding ground truth geometry.

Source: https://github.com/apple/ml-hypersim

Variants: Hypersim

Associated Benchmarks

This dataset is used in 3 benchmarks.

Recent Benchmark Submissions

Task | Model | Paper | Date
Semantic Segmentation | EMSANet (2x ResNet-34 NBt1D) | PanopticNDT: Efficient and Robust Panoptic … | 2023-09-24
3D Semantic Segmentation | PanopticNDT (10cm) | PanopticNDT: Efficient and Robust Panoptic … | 2023-09-24
3D Semantic Segmentation | SemanticNDT (10cm) | PanopticNDT: Efficient and Robust Panoptic … | 2023-09-24
Panoptic Segmentation | EMSANet (2x ResNet-34 NBt1D) | PanopticNDT: Efficient and Robust Panoptic … | 2023-09-24
Semantic Segmentation | MoCo-v3 (ViT-B) | MultiMAE: Multi-modal Multi-task Masked Autoencoders | 2022-04-04
Semantic Segmentation | MultiMAE (ViT-B) | MultiMAE: Multi-modal Multi-task Masked Autoencoders | 2022-04-04
Semantic Segmentation | MAE (ViT-B) | MultiMAE: Multi-modal Multi-task Masked Autoencoders | 2022-04-04
Semantic Segmentation | DINO (ViT-B) | MultiMAE: Multi-modal Multi-task Masked Autoencoders | 2022-04-04

Research Papers

Recent papers with results on this dataset: