FlyingThings3D is a synthetic dataset for optical flow, disparity and scene flow estimation. It consists of everyday objects flying along …
Music21 is an untrimmed video dataset crawled by keyword query from Youtube. It contains music performances belonging to 21 categories. …
CamVid (Cambridge-driving Labeled Video Database) is a road/driving scene understanding database which was originally captured as five video sequences with …
Stack of 2D gray images of glass fiber-reinforced polyamide 66 (GF-PA66) 3D X-ray Computed Tomography (XCT) specimen. Usage: 2D/3D image …
A Multi-Task 4D Radar-Camera Fusion Dataset for Autonomous Driving on Water Surfaces. Description of the dataset: WaterScenes, the first …
WildScenes is a bi-modal benchmark dataset consisting of multiple large-scale, sequential traversals in natural environments, including semantic annotations in high-resolution …
The xBD dataset contains over 45,000 km² of polygon-labeled pre- and post-disaster imagery. The dataset provides the post-disaster imagery …
The CIFAR-10 database (Canadian Institute For Advanced Research database) is a large collection of natural color images. It has a …
The CIFAR-100 dataset (Canadian Institute for Advanced Research, 100 classes) is a subset of the Tiny Images dataset and consists …
WSJ0-2mix is a speech recognition corpus of speech mixtures using utterances from the Wall Street Journal (WSJ0) corpus. Source: [Deep …
The CIFAR-10 database (Canadian Institute For Advanced Research database) is a large collection of natural color images. It has a …
The CIFAR-100 dataset (Canadian Institute for Advanced Research, 100 classes) is a subset of the Tiny Images dataset and consists …
The ImageNet dataset contains 14,197,122 annotated images according to the WordNet hierarchy. Since 2010 the dataset has been used in the …
The MNIST database (Modified National Institute of Standards and Technology database) is a large collection of handwritten digits. It has …
Adversarial GLUE (AdvGLUE) is a new multi-task benchmark to quantitatively and thoroughly explore and evaluate the vulnerabilities of modern large-scale …
The CIFAR-10 database (Canadian Institute For Advanced Research database) is a large collection of natural color images. It has a …
The CIFAR-100 dataset (Canadian Institute for Advanced Research, 100 classes) is a subset of the Tiny Images dataset and consists …
The ImageNet dataset contains 14,197,122 annotated images according to the WordNet hierarchy. Since 2010 the dataset has been used in the …
The ImageNet-A dataset consists of real-world, unmodified, and naturally occurring examples that are misclassified by ResNet models. Source: [On Robustness …
ImageNet-C is an open source data set that consists of algorithmically generated corruptions (blur, noise) applied to the ImageNet test-set. …
The Stylized-ImageNet dataset is created by removing local texture cues in ImageNet while retaining global shape information on natural images …
Data Set Information: Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records …
In an effort to catalog insect biodiversity, we propose a new large dataset of hand-labelled insect images, the BIOSCAN-1M Insect …
The purpose of this dataset was to study gender bias in occupations. Online biographies, written in English, were collected to …
BoolQ is a question answering dataset for yes/no questions containing 15942 examples. These questions are naturally occurring – they are …
This dataset is a combination of the following three datasets: figshare, the SARTAJ dataset, and Br35H. This dataset contains 7022 …
The quality of AI-generated images has rapidly increased, leading to concerns of authenticity and trustworthiness. CIFAKE is a dataset that …
The CIFAR-100 dataset (Canadian Institute for Advanced Research, 100 classes) is a subset of the Tiny Images dataset and consists …
Common corruptions dataset for CIFAR10
Contains hundreds of frontal view X-rays and is the largest public resource for COVID-19 image and prognostic data, making it …
Data was collected for normal bearings, single-point drive end and fan end defects. Data was collected at 12,000 samples/second and …
The normal chest X-ray (left panel) depicts clear lungs without any areas of abnormal opacification in the image. Bacterial pneumonia …
We construct the ForgeryNet dataset, an extremely large face forgery dataset with unified annotations in image- and video-level data across …
A public data set of walking full-body kinematics and kinetics in individuals with Parkinson’s disease
HOWS-CL-25 (Household Objects Within Simulation dataset for Continual Learning) is a synthetic dataset especially designed for object classification on mobile …
The HRF dataset is a dataset for retinal vessel segmentation which comprises 45 images and is organized as 15 subsets. …
The IRFL dataset consists of idioms, similes, and metaphors with matching figurative and literal images, as well as two novel …
The goal for ISIC 2019 is to classify dermoscopic images among nine different diagnostic categories. 25,331 images are available for training across …
This dataset was presented as part of the ICLR 2023 paper A framework for benchmarking Class-out-of-distribution detection and its application …
In this work, we introduce the In-Diagram Logic (InDL) dataset, an innovative resource crafted to rigorously evaluate the …
This data set comprises 22 fundus images with their corresponding manual annotations for the blood vessels, separated as arteries and …
The Liver-US dataset is a comprehensive collection of high-quality ultrasound images of the liver, including both normal and abnormal cases. …
The minimalist histopathology image analysis dataset (MHIST) is a binary classification dataset of 3,152 fixed-size images of colorectal polyps, each …
The process by which sections in a document are demarcated and labeled is known as section identification. Such sections are …
MixedWM38 Dataset(WaferMap) has more than 38000 wafer maps, including 1 normal pattern, 8 single defect patterns, and 29 mixed defect …
Early detection of retinal diseases is one of the most important means of preventing partial or permanent blindness in patients. …
A large real-world event-based dataset for object classification. Source: HATS: Histograms of Averaged Time Surfaces for Robust Event-based Object Classification
The N-ImageNet dataset is an event-camera counterpart for the ImageNet dataset. The dataset is obtained by moving an event camera …
The RITE (Retinal Images vessel Tree Extraction) is a database that enables comparative studies on segmentation or classification of arteries …
The RSSCN7 dataset contains satellite images acquired from Google Earth, which were originally collected for remote sensing scene classification. We …
The Recognizing Textual Entailment (RTE) datasets come from a series of textual entailment challenges. Data from RTE1, RTE2, RTE3 and …
The Schema-Guided Dialogue (SGD) dataset consists of over 20k annotated multi-domain, task-oriented conversations between a human and a virtual assistant. …
This dataset is based on the Spiking Heidelberg Digits (SHD) dataset. Sample inputs consist of two spike encoded digits sampled …
The SPOTS-10 dataset is an extensive collection of grayscale images showcasing diverse patterns found in ten animal species. Specifically, SPOTS-10 …
The Stanford Sentiment Treebank is a corpus with fully labeled parse trees that allows for a complete analysis of the …
Sentiment140 is a dataset that allows you to discover the sentiment of a brand, product, or topic on Twitter. Source: …
This dataset consists of computer-generated images for gas leakage segmentation. It features diverse backgrounds, interfering foreground objects, and precise ground …
arXiv: https://arxiv.org/abs/2304.11708 Accepted at the 29th International Congress on Sound and Vibration (ICSV29). The drone has been used for various …
Table-ACM12K (TACM12K) is a relational table dataset derived from the ACM heterogeneous graph dataset. It includes four tables: papers, authors, …
Table-LastFm2K (TLF2K) is a relational table dataset derived from the classical LastFM2K dataset. It contains three tables: artists, user_artists, and …
Table-MovieLens1M (TML1M) is a relational table dataset derived from the classical MovieLens1M dataset. It consists of three tables: users, movies, …
The Winograd Schema Challenge was introduced both as an alternative to the Turing Test and as a test of a …
WiC is a benchmark for the evaluation of context-sensitive word embeddings. WiC is framed as a binary classification task. Each …
Enlarge the dataset to understand how image background affects the Computer Vision ML model. With the following topics: Blur Background …
A new face annotation dataset with balanced distribution between genders and ethnic origins. Source: [SensitiveNets: Learning Agnostic Representations with Application …
MORPH is a facial age estimation dataset, which contains 55,134 facial images of 13,617 subjects ranging from 16 to 77 …
The UTKFace dataset is a large-scale face dataset with long age span (range from 0 to 116 years old). The …
Bentham manuscripts refers to a large set of documents that were written by the renowned English philosopher and reformer Jeremy …
Digital Peter is a dataset of Peter the Great's manuscripts annotated for segmentation and text recognition. The dataset may be …
The database is written in Cyrillic and shares the same 33 characters. Besides these characters, the Kazakh alphabet also contains …
The IAM database contains 13,353 images of handwritten lines of text created by 657 writers. The texts those writers transcribed …
The IAM database contains 13,353 images of handwritten lines of text created by 657 writers. The texts those writers transcribed …
Handwritten Text Recognition (HTR) is an open problem at the intersection of Computer Vision and Natural Language Processing. The main …
This dataset arises from the READ project (Horizon 2020). The dataset consists of a subset of documents from the Ratsprotokolle …
This dataset arises from the READ project (Horizon 2020). The dataset consists of a subset of documents from the Ratsprotokolle …
Saint Gall dataset contains handwritten historical manuscripts written in Latin that date back to the 9th century. It consists of …
A dataset for automated aerial scene classification of disaster events from on-board a UAV. Source: [Deep-Learning-Based Aerial Image Classification …
The dataset contains aerial images covering three commonly occurring natural disasters (earthquake/collapsed buildings, flood, wildfire/fire) and a normal class; do …
AmsterTime dataset offers a collection of 2,500 well-curated images matching the same scene from a street view matched to historical …
ArtDL is a novel painting data set for iconography classification composed of images collected from online sources. Most of the …
The Breast Cancer Histopathological Image Classification (BreakHis) is composed of 9,109 microscopic images of breast tumor tissue collected from 82 …
The CIFAR-10 database (Canadian Institute For Advanced Research database) is a large collection of natural color images. It has a …
The CIFAR-100 dataset (Canadian Institute for Advanced Research, 100 classes) is a subset of the Tiny Images dataset and consists …
CINIC-10 is a dataset for image classification. It has a total of 270,000 images, 4.5 times that of CIFAR-10. It …
The Caltech-UCSD Birds-200-2011 (CUB-200-2011) dataset is the most widely used dataset for the fine-grained visual categorization task. It contains 11,788 images of …
Caltech-256 is an object recognition dataset containing 30,607 real-world images, of different sizes, spanning 257 classes (256 object classes and …
Update on 3DIdent, where we introduce six additional object classes (Hare, Dragon, Cow, Armadillo, Horse, and Head), and impose a …
Chaoyang dataset contains 1111 normal, 842 serrated, 1404 adenocarcinoma, 664 adenoma, and 705 normal, 321 serrated, 840 adenocarcinoma, 273 adenoma …
Clothing1M contains 1M clothing images in 14 classes. It is a dataset with noisy labels, since the data is collected …
ColonINST is a large-scale instruction tuning dataset designed for multimodal analysis in colonoscopy. This dataset comprises 62 categories, 303,001 colonoscopy …
ColonINST is a large-scale instruction tuning dataset designed for multimodal analysis in colonoscopy. This dataset comprises 62 categories, 303,001 colonoscopy …
This is a dataset with spurious correlations which can be used to evaluate machine learning methods for out-of-distribution generalization, causal …
Danish Fungi 2020 (DF20) is a fine-grained dataset and benchmark. The dataset, constructed from observations submitted to the Danish Fungal …
Danish Fungi 2020 (DF20) is a novel fine-grained dataset and benchmark. The dataset, constructed from observations submitted to the Danish …
The Describable Textures Dataset (DTD) contains 5640 texture images in the wild. They are annotated with human-centric attributes inspired by …
Comprises 11 hand gesture categories from 29 subjects under 3 illumination conditions. Source: [A Low Power, Fully Event-Based Gesture Recognition …
The ESC-50 dataset is a labeled collection of 2000 environmental audio recordings suitable for benchmarking methods of environmental sound classification. …
EuroSAT is a dataset and deep learning benchmark for land use and land cover classification. The dataset is based on …
A SAR version of the EuroSAT dataset. The images were collected from Sentinel-1 GRD products (two bands VV and VH) …
See paper: Caldas, Sebastian, et al. "Leaf: A benchmark for federated settings." arXiv preprint arXiv:1812.01097 (2018).
Sharan, Lavanya, Ruth Rosenholtz, and Edward Adelson. "Material perception: What can you see in a brief glance?." Journal of Vision …
Fashion-MNIST is a dataset comprising 28×28 grayscale images of 70,000 fashion products from 10 categories, with 7,000 images per …
Object detection benchmark for logo detection. Images are natural scenes. Each image contains multiple objects, and each image has a …
The Food-101 dataset consists of 101 food categories with 750 training and 250 test images per category, making a total …
The Food-101N dataset is introduced in "CleanNet: Transfer Learning for Scalable Image Classifier Training with Label Noise" (CVPR'18). It is an …
The German Traffic Sign Recognition Benchmark (GTSRB) contains 43 classes of traffic signs, split into 39,209 training images and 12,630 …
Four pathologists from Longhua Hospital Shanghai University of Traditional Chinese Medicine provide 600 gastric cancer pathology images at …
We construct Gaze-CIFAR-10, a gaze-augmented image dataset based on the standard CIFAR-10 benchmark, enhanced with human eye-tracking annotations collected using …
After defining a taxonomy of the main stone deterioration patterns and anomalies, we selected 354 highly representative images of stone-built …
The ImageNet dataset contains 14,197,122 annotated images according to the WordNet hierarchy. Since 2010 the dataset has been used in the …
This split was introduced in TEMI (BMVC 2023) Adaloglou, Nikolas, Felix Michels, Hamza Kalisch, and Markus Kollmann. "Exploring the Limits …
Imagenet32 is a large dataset of small images, a down-sampled version of ImageNet. Imagenet32 is composed of …
Imagenet64 is a large dataset of small images, a down-sampled version of ImageNet. Imagenet64 comprises 1,281,167 training data and …
ImageNet-9 consists of images with different amounts of background and foreground signal, which you can use to measure the extent …
ImageNet-P consists of noise, blur, weather, and digital distortions. The dataset has validation perturbations; has difficulty levels; has CIFAR-10, Tiny …
ImageNet-Sketch data set consists of 50,889 images, approximately 50 images for each of the 1000 ImageNet classes. The data set …
Imagenette is a subset of 10 easily classified classes from Imagenet (bench, English springer, cassette player, chain saw, church, French …
Context: This is image data of natural scenes around the world. Content: This data contains around 25k images of size …
JFT-300M is an internal Google dataset used for training image classification models. Images are labeled using an algorithm that uses …
The KTH-TIPS (Textures under varying Illumination, Pose and Scale) image database was created to extend the CUReT database in two …
Kuzushiji-MNIST is a drop-in replacement for the MNIST dataset (28x28 grayscale, 70,000 images). Since MNIST restricts us to 10 classes, …
The KVASIR Dataset was released as part of the medical multimedia challenge presented by MediaEval. It is based on images …
The LIMUC dataset is the largest publicly available labeled ulcerative colitis dataset, comprising 11,276 images from 564 patients and …
LabelMe database is a large collection of images with ground truth labels for object detection and recognition. The annotations come …
It is composed of around 770k color 256x256 RGB images extracted from the European Union Intellectual Property Office (EUIPO) …
The PlantVillage dataset, with over 54,000 images spanning 14 plant species and 26 disease types, has been widely used for …
The MAMe dataset contains high-resolution, variable-shape images of artworks from 3 different museums: - The Metropolitan Museum …
The MNIST database (Modified National Institute of Standards and Technology database) is a large collection of handwritten digits. It has …
The dataset contains a total of 27,558 cell images with equal instances of parasitized and uninfected cells. Source: Malaria Dataset
The MultiMNIST dataset is generated from MNIST. The training and test sets are generated by overlaying a digit on top of …
The Neuromorphic-Caltech101 (N-Caltech101) dataset is a spiking version of the original frame-based Caltech101 dataset. The original dataset contained both a …
The Neuromorphic-MNIST (N-MNIST) dataset is a spiking version of the original frame-based MNIST dataset. It consists of the …
The NCT-CRC-HE-100K dataset is a set of 100,000 non-overlapping image patches extracted from 86 H&E stained human cancer tissue slides …
ObjectNet is a test set of images collected directly using crowd-sourcing. ObjectNet is unique as the objects are captured at …
Omni-Realm Benchmark (OmniBenchmark) is a diverse (21 semantic realm-wise datasets) and concise (realm-wise datasets have no concepts overlapping) benchmark for …
We introduce the Oracle-MNIST dataset, comprising 28×28 grayscale images of 30,222 ancient characters from 10 categories, for benchmarking pattern …
The Oxford-IIIT Pet Dataset has 37 categories with roughly 200 images for each class. The images have large variations …
The Oxford-IIIT Pet Dataset is a 37-category pet dataset with roughly 200 images for each class. The images have large …
PASCAL VOC 2007 is a dataset for image recognition. The twenty object classes that have been selected are: Person: person …
The Prima head pose dataset consists of 2790 images of 15 persons recorded twice. Pitch values lie in the interval …
The Places205 dataset is a large-scale scene-centric dataset with 205 common scene categories. The training dataset contains around 2,500,000 images …
The Places365 dataset is a scene recognition dataset. It is composed of 10 million images comprising 434 scene classes. There …
PlantDoc is a dataset for visual plant disease detection. The dataset contains 2,598 data points in total across 13 plant …
The PlantVillage dataset consists of 54303 healthy and unhealthy leaf images divided into 38 categories by species and disease.
The exact pre-processing steps used to construct the MNIST dataset have long been lost. This leaves us with no reliable …
RESISC45 dataset is a dataset for Remote Sensing Image Scene Classification (RESISC). It contains 31,500 RGB images of size 256×256 …
Part of the Controlled Noisy Web Labels Dataset.
Part of the Controlled Noisy Web Labels Dataset.
Part of the Controlled Noisy Web Labels Dataset.
The STL-10 is an image dataset derived from ImageNet and popularly used to evaluate algorithms of unsupervised feature learning or …
The Scene UNderstanding (SUN) database contains 899 categories and 130,519 images. There are 397 well-sampled categories to evaluate numerous state-of-the-art …
Street View House Numbers (SVHN) is a digit classification benchmark dataset that contains 600,000 32×32 RGB images of printed digits …
So2Sat LCZ42 consists of local climate zone (LCZ) labels of about half a million Sentinel-1 and Sentinel-2 image patches in …
The Stanford Cars dataset consists of 196 classes of cars with a total of 16,185 images, taken from the rear. …
Stanford Online Products (SOP) dataset has 22,634 classes with 120,053 product images. The first 11,318 classes (59,551 images) are split …
Visual Wake Words represents a common microcontroller vision use-case of identifying whether a person is present in the image or …
Our goal is to improve upon the status quo for designing image classification models trained in one domain that perform …
The WebVision dataset is designed to facilitate the research on learning visual representation from noisy web data. It is a …
The iNaturalist 2017 dataset (iNat) contains 675,170 training and validation images from 5,089 natural fine-grained categories. Those categories belong to …
The iWildCam2020-WILDS dataset is a variant of the iWildCam 2020 dataset. iWildCam2020-WILDS is a benchmark dataset designed to test OOD …
The smallNORB dataset is a dataset for 3D object recognition from shape. It contains images of 50 toys belonging to …
SUDO is a benchmark of 50 real-world malicious tasks designed to evaluate LLM-based computer agents in live desktop and web …
CNN/Daily Mail is a dataset for text summarization. Human generated abstractive summary bullets were generated from news stories in CNN …
COCO Captions contains over one and a half million captions describing over 330,000 images. For the training and validation images, …
CSL is a synthetic dataset introduced in Murphy et al. (2019) to test the expressivity of GNNs. In particular, graphs …
CommonGen is constructed through a combination of crowdsourced and existing caption corpora, and consists of 79k commonsense descriptions over 35k unique …
Czech restaurant information is a dataset for NLG in task-oriented spoken dialogue systems with Czech as the target language. It …
DART is a large dataset for open-domain structured data record to text generation. DART consists of 82,191 examples across different …
DailyDialog is a high-quality multi-turn open-domain English dialog dataset. It contains 13,118 dialogues split into a training set with 11,118 …
As a part of our research efforts toward making LLMs safer for public …
LCSTS is a large Chinese short text summarization dataset constructed from the Chinese microblogging website Sina Weibo, which …
OpenWebText is an open-source recreation of the WebText corpus. The text is web content extracted from URLs shared on Reddit …
ROCStories is a collection of commonsense short stories. The corpus consists of 100,000 five-sentence stories. Each story logically follows everyday …
ReDial (Recommendation Dialogues) is an annotated dataset of dialogues, where users recommend movies to each other. The dataset consists of …
The SciQ dataset contains 13,679 crowdsourced science exam questions about Physics, Chemistry and Biology, among others. The questions are in …