
MMNeedle

Long-Context Understanding Benchmark

Performance Over Time

Metric: 1 Image, 4×4 Stitching, Exact Accuracy (11 results shown)
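In MMNeedle (a multimodal needle-in-a-haystack benchmark), the model must locate a target sub-image described by a caption inside a context of stitched images; in the "1 Image, 4×4 Stitching" setting the context is a single image composed of a 4×4 grid of sub-images, and exact accuracy counts a prediction as correct only if the needle's location is identified exactly. The sketch below shows one plausible way such an exact-match score could be computed; the `NeedleLocation` fields and helper names are illustrative assumptions, not the benchmark's official evaluation code.

```python
# Hedged sketch of an "exact accuracy" computation for MMNeedle-style predictions.
# Field names (image_idx, row, col) are assumptions for illustration only.
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class NeedleLocation:
    image_idx: int  # which image in the context contains the needle
    row: int        # row of the sub-image within the stitched grid (0-3 for 4x4)
    col: int        # column of the sub-image within the stitched grid (0-3 for 4x4)


def exact_accuracy(predictions: Iterable[NeedleLocation],
                   targets: Iterable[NeedleLocation]) -> float:
    """Fraction of samples where every coordinate matches the target exactly."""
    preds, golds = list(predictions), list(targets)
    if len(preds) != len(golds):
        raise ValueError("predictions and targets must have the same length")
    correct = sum(p == g for p, g in zip(preds, golds))
    return correct / len(golds) if golds else 0.0


# Example: 2 of 3 predictions match exactly -> 66.67% exact accuracy.
preds = [NeedleLocation(0, 1, 2), NeedleLocation(0, 3, 3), NeedleLocation(0, 0, 0)]
golds = [NeedleLocation(0, 1, 2), NeedleLocation(0, 3, 3), NeedleLocation(0, 2, 1)]
print(f"{exact_accuracy(preds, golds):.2%}")
```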

Top Performing Models

| Rank | Model | Paper | 1 Image, 4×4 Stitching, Exact Accuracy | Date | Code |
|---|---|---|---|---|---|
| 1 | GPT-4o | GPT-4 Technical Report | 94.60 | 2023-03-15 | 📦 openai/evals · 📦 shmsw25/factscore · 📦 unispac/visual-adversarial-examples-jailbreak-large-language-models |
| 2 | Gemini Pro 1.5 | Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context | 90.34 | 2024-03-08 | 📦 dlvuldet/primevul |
| 3 | GPT-4V | GPT-4 Technical Report | 86.09 | 2023-03-15 | 📦 openai/evals · 📦 shmsw25/factscore · 📦 unispac/visual-adversarial-examples-jailbreak-large-language-models |
| 4 | LLaVA-Llama-3 | LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images | 43.80 | 2024-03-18 | 📦 thunlp/llava-uhd |
| 5 | Gemini Pro 1.0 | Gemini: A Family of Highly Capable Multimodal Models | 29.53 | 2023-12-19 | 📦 valdecy/pybibx |
| 6 | IDEFICS2-8B | What matters when building vision-language models? | 18.90 | 2024-05-03 | - |
| 7 | CogVLM2-Llama-3 | CogVLM: Visual Expert for Pretrained Language Models | 7.30 | 2023-11-06 | 📦 thudm/cogvlm · 📦 THUDM/CogAgent · 📦 2024-MindSpore-1/Code2 · 📦 MS-P3/code5 |
| 8 | InstructBLIP-Flan-T5-XXL | InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning | 3.80 | 2023-05-11 | 📦 salesforce/lavis · 📦 tabtoyou/kollava · 📦 pwc-1/Paper-9 · 📦 MS-P3/code3 |
| 9 | mPLUG-Owl-v2 | mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration | 1.90 | 2023-11-07 | 📦 X-PLUG/mPLUG-Owl |
| 10 | CogVLM-17B | CogVLM: Visual Expert for Pretrained Language Models | 0.00 | 2023-11-06 | 📦 thudm/cogvlm · 📦 THUDM/CogAgent · 📦 2024-MindSpore-1/Code2 · 📦 MS-P3/code5 |

All Papers (11)