COCO (Common Objects in Context)

Image Retrieval Benchmark

Performance Over Time

Showing 6 results. Metric: recall@1

Top Performing Models

| Rank | Model | Paper | recall@1 (%) | Date | Code |
|------|-------|-------|--------------|------|------|
| 1 | Oscar | Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks | 98.30 | 2020-04-13 | rmokady/clip_prefix_caption, microsoft/Oscar, milvlg/rosita, ThanThoai/Visual-Question-Answering_Vietnamese |
| 2 | BLIP-2 ViT-G (fine-tuned) | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | 68.30 | 2023-01-30 | huggingface/transformers, salesforce/lavis, thudm/visualglm-6b |
| 3 | VisualSparta | VisualSparta: An Embarrassingly Simple Approach to Large-scale Text-to-Image Search with Weighted Bag-of-words | 68.20 | 2021-01-01 | soco-ai/SF-QA |
| 4 | BLIP-2 ViT-L (fine-tuned) | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | 66.30 | 2023-01-30 | huggingface/transformers, salesforce/lavis, thudm/visualglm-6b |
| 5 | FLAVA (zero-shot) | FLAVA: A Foundational Language And Vision Alignment Model | 38.38 | 2021-12-08 | facebookresearch/multimodal, apsdehal/flava-tutorials, social-ai-studio/matk, 2024-MindSpore-1/Code2 |
| 6 | CLIP (zero-shot) | FLAVA: A Foundational Language And Vision Alignment Model | 33.29 | 2021-12-08 | facebookresearch/multimodal, apsdehal/flava-tutorials, social-ai-studio/matk, 2024-MindSpore-1/Code2 |
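
For readers unfamiliar with the metric, recall@1 is commonly computed as the percentage of queries whose top-ranked retrieved item is a correct match. The sketch below illustrates that common definition only; the function name `recall_at_k` and the toy rankings are hypothetical, and this page does not state which COCO test split or retrieval direction (text-to-image or image-to-text) each listed result uses.

```python
from typing import Sequence, Set


def recall_at_k(
    ranked_ids: Sequence[Sequence[int]],
    relevant_ids: Sequence[Set[int]],
    k: int = 1,
) -> float:
    """Return the percentage of queries with a relevant item in the top k.

    ranked_ids[i] is the retrieval ranking for query i (best first).
    relevant_ids[i] is the set of items counted as correct for query i.
    """
    if len(ranked_ids) != len(relevant_ids):
        raise ValueError("one ranking and one relevant set are needed per query")
    if not ranked_ids:
        raise ValueError("at least one query is required")

    hits = sum(
        1
        for ranked, relevant in zip(ranked_ids, relevant_ids)
        if any(item in relevant for item in ranked[:k])
    )
    return 100.0 * hits / len(ranked_ids)


# Hypothetical toy data: 4 queries, each ranking 3 candidate items.
toy_rankings = [[3, 1, 2], [0, 2, 1], [2, 0, 1], [1, 3, 0]]
toy_relevant = [{3}, {2}, {2}, {0}]

print(recall_at_k(toy_rankings, toy_relevant, k=1))  # 50.0
print(recall_at_k(toy_rankings, toy_relevant, k=2))  # 75.0
```

Because the score depends on the size of the candidate pool and on the retrieval direction, values are only comparable between models evaluated under the same protocol.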

All Papers (6)