
ImageNet

Zero-Shot Transfer Image Classification Benchmark

Performance Over Time

Showing 20 results | Metric: Top-1 Accuracy (%)

Top Performing Models

| Rank | Model | Paper | Top-1 Accuracy (%) | Date | Code |
|------|-------|-------|--------------------|------|------|
| 1 | M2-Encoder | M2-Encoder: Advancing Bilingual Image-Text Understanding by Large-scale Efficient Pretraining | 88.50 | 2024-01-29 | alipay/Ant-Multi-Modal-Framework |
| 2 | CoCa | CoCa: Contrastive Captioners are Image-Text Foundation Models | 86.30 | 2022-05-04 | mlfoundations/open_clip, facebookresearch/multimodal, lucidrains/CoCa-pytorch |
| 3 | LiT-22B | Scaling Vision Transformers to 22 Billion Parameters | 85.90 | 2023-02-10 | lucidrains/flash-cosine-sim-attention |
| 4 | BASIC | Combined Scaling for Zero-shot Transfer Learning | 85.70 | 2021-11-19 | - |
| 5 | LiT ViT-e | PaLI: A Jointly-Scaled Multilingual Language-Image Model | 85.40 | 2022-09-14 | google-research/big_vision |
| 6 | LiT-tuning | LiT: Zero-Shot Transfer with Locked-image text Tuning | 84.50 | 2021-11-15 | mlfoundations/open_clip, google-research/vision_transformer, google-research/big_vision, laion-ai/clip_benchmark, eify/clip_benchmark |
| 7 | IMP-MoE-L | Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception | 83.90 | 2023-05-10 | - |
| 8 | EVA-CLIP-18B | EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters | 83.80 | 2024-02-06 | baaivision/EVA |
| 9 | InternVL-C | InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | 83.20 | 2023-12-21 | opengvlab/internvl, opengvlab/internvl-mmdetseg |
| 10 | MAWS (ViT-2B) | The effectiveness of MAE pre-pretraining for billion-scale pretraining | 82.10 | 2023-03-23 | facebookresearch/maws |
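All of the models above are evaluated the same way: class labels are never seen during fine-tuning; instead, each ImageNet class name is turned into a text prompt, embedded with the model's text encoder, and the class whose text embedding is closest to the image embedding is taken as the prediction. The snippet below is a minimal sketch of this protocol using the mlfoundations/open_clip package listed in the Code column. The model name, checkpoint tag, image path, and three-class list are illustrative stand-ins for the 1000-class ImageNet label set and the prompt ensembles used in the papers above, not a reproduction of any specific entry.

```python
# Sketch of CLIP-style zero-shot classification with open_clip.
# Checkpoint tag, image path, and class list are illustrative placeholders.
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

# Zero-shot "classifier": one text embedding per class, built from prompts only.
class_names = ["goldfish", "tabby cat", "golden retriever"]
prompts = [f"a photo of a {name}" for name in class_names]

with torch.no_grad():
    text_features = model.encode_text(tokenizer(prompts))
    text_features /= text_features.norm(dim=-1, keepdim=True)

    image = preprocess(Image.open("example.jpg")).unsqueeze(0)  # placeholder image
    image_features = model.encode_image(image)
    image_features /= image_features.norm(dim=-1, keepdim=True)

    # Cosine similarity between the image and each class prompt; argmax is the prediction.
    logits = 100.0 * image_features @ text_features.T
    prediction = class_names[logits.argmax(dim=-1).item()]

print(prediction)
```

Benchmark numbers in the table are obtained by running this kind of prompt-based prediction over the full ImageNet validation set and reporting top-1 accuracy; laion-ai/clip_benchmark, also listed above, packages that evaluation loop.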

All Papers (20)