JFT-300M

Dataset Information
Modalities: Images
Languages: Chinese
Introduced: 2017
License: Private (not publicly available)

Overview

JFT-300M is an internal Google dataset used for training image classification models. Its images are labeled by an algorithm that combines raw web signals, connections between web pages, and user feedback. This yields over one billion labels for the 300M images (a single image can have multiple labels). Of these, approximately 375M labels are selected by an algorithm that aims to maximize the label precision of the selected images.
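The counts above imply a few ratios that are easy to check. The following is a minimal sketch of that arithmetic only, not of the selection algorithm itself, which the description does not specify. The constant and function names are hypothetical, and the raw label count is taken as exactly one billion even though the description says "over one billion", so the raw-label ratios are rough bounds.

```python
# Approximate figures quoted in the overview.
NUM_IMAGES = 300_000_000
RAW_LABELS = 1_000_000_000     # assumption: "over one billion" taken as exactly 1e9
SELECTED_LABELS = 375_000_000  # "approximately 375M"


def label_stats(num_images, raw_labels, selected_labels):
    """Return simple ratios derived from the dataset's headline counts."""
    return {
        "raw_labels_per_image": raw_labels / num_images,
        "selected_labels_per_image": selected_labels / num_images,
        "fraction_of_labels_kept": selected_labels / raw_labels,
    }


stats = label_stats(NUM_IMAGES, RAW_LABELS, SELECTED_LABELS)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, each image keeps about 1.25 selected labels on average, and at most roughly 37.5% of the raw labels survive selection.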

Variants: JFT-300M

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task                  Model                 Paper                                 Date
Image Classification  V-MoE-H/14 (Every-2)  Scaling Vision with Sparse Mixture …  2021-06-10
Image Classification  V-MoE-H/14 (Last-5)   Scaling Vision with Sparse Mixture …  2021-06-10
Image Classification  V-MoE-L/16 (Every-2)  Scaling Vision with Sparse Mixture …  2021-06-10
Image Classification  ViT-H/14              Scaling Vision with Sparse Mixture …  2021-06-10

Research Papers

Recent papers with results on this dataset: