MMR-Benchmark

MMR total Benchmark

Performance Over Time

Chart of Total Column Score over time, covering 13 results; the top ten models are tabulated below.

Top Performing Models

| Rank | Model | Paper | Total Column Score | Date | Code |
|------|-------|-------|--------------------|------|------|
| 1 | GPT-4o | GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding | 457.00 | 2024-06-14 | - |
| 2 | GPT-4V | The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision) | 415.00 | 2023-09-29 | qi-zhangyang/gemini-vs-gpt4v, vista-h/gpt-4v_social_media |
| 3 | LLaVA-NEXT-34B | Visual Instruction Tuning | 412.00 | 2023-04-17 | huggingface/transformers, haotian-liu/LLaVA, LLaVA-VL/LLaVA-NeXT |
| 4 | Phi-3-Vision | Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone | 397.00 | 2024-04-22 | - |
| 5 | InternVL2-8B | InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | 368.00 | 2023-12-21 | opengvlab/internvl, opengvlab/internvl-mmdetseg |
| 6 | Qwen-vl-max | Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | 366.00 | 2023-08-24 | qwenlm/qwen-vl, brandon3964/multimodal-task-vector |
| 7 | LLaVA-NEXT-13B | Visual Instruction Tuning | 335.00 | 2023-04-17 | huggingface/transformers, haotian-liu/LLaVA, LLaVA-VL/LLaVA-NeXT |
| 8 | Qwen-vl-plus | Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | 310.00 | 2023-08-24 | qwenlm/qwen-vl, brandon3964/multimodal-task-vector |
| 9 | Idefics-2-8B | What matters when building vision-language models? | 256.00 | 2024-05-03 | - |
| 10 | LLaVA-1.5-13B | Visual Instruction Tuning | 243.00 | 2023-04-17 | huggingface/transformers, haotian-liu/LLaVA, LLaVA-VL/LLaVA-NeXT |
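For readers who want to work with the leaderboard programmatically, the sketch below reproduces the ten tabulated rows as plain Python records and re-derives the ranking by sorting on Total Column Score. The `Entry` class and field names are illustrative assumptions, not part of the wiki or of any benchmark tooling.

```python
# Illustrative sketch: the leaderboard rows above as plain records,
# re-ranked by Total Column Score (descending).
from dataclasses import dataclass

@dataclass
class Entry:
    model: str   # model name as listed on the leaderboard
    score: float # Total Column Score (MMR total)
    date: str    # paper date, YYYY-MM-DD

entries = [
    Entry("GPT-4o", 457.0, "2024-06-14"),
    Entry("GPT-4V", 415.0, "2023-09-29"),
    Entry("LLaVA-NEXT-34B", 412.0, "2023-04-17"),
    Entry("Phi-3-Vision", 397.0, "2024-04-22"),
    Entry("InternVL2-8B", 368.0, "2023-12-21"),
    Entry("Qwen-vl-max", 366.0, "2023-08-24"),
    Entry("LLaVA-NEXT-13B", 335.0, "2023-04-17"),
    Entry("Qwen-vl-plus", 310.0, "2023-08-24"),
    Entry("Idefics-2-8B", 256.0, "2024-05-03"),
    Entry("LLaVA-1.5-13B", 243.0, "2023-04-17"),
]

# Sort by score, highest first, and print rank, model, score, and date.
for rank, e in enumerate(sorted(entries, key=lambda e: e.score, reverse=True), start=1):
    print(f"{rank:2d}  {e.model:<16} {e.score:6.1f}  {e.date}")
```

Running the snippet prints the same ordering as the table, which makes it easy to splice in new results or compute score gaps between adjacent ranks.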

All Papers (13)