MM-Vet

Dataset Information
Modalities: Images, Texts
Languages: English
Introduced: 2023
License
Homepage

Overview

MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities

Variants: MM-Vet

Associated Benchmarks

This dataset is used in 2 benchmarks:

Recent Benchmark Submissions

| Task | Model | Paper | Date |
|---|---|---|---|
| Visual Question Answering | Janus-Pro-7B | Janus-Pro: Unified Multimodal Understanding and … | 2025-01-29 |
| Visual Question Answering | Janus-Pro-1B | Janus-Pro: Unified Multimodal Understanding and … | 2025-01-29 |
| Visual Question Answering | PIIP-LLaVA (Vicuna-7B, ConvNeXt-L, CLIP-L) | Parameter-Inverted Image Pyramid Networks for … | 2025-01-14 |
| Visual Question Answering | Lyra-Mini | Lyra: An Efficient and Speech-Centric … | 2024-12-12 |
| Visual Question Answering (VQA) | Lyra-Pro | Lyra: An Efficient and Speech-Centric … | 2024-12-12 |
| Visual Question Answering | Lyra-Base | Lyra: An Efficient and Speech-Centric … | 2024-12-12 |
| Visual Question Answering | Lyra-Pro | Lyra: An Efficient and Speech-Centric … | 2024-12-12 |
| Visual Question Answering | ILLUME | ILLUME: Illuminating Your LLMs to … | 2024-12-09 |
| Visual Question Answering | LLaVA-1.5-7B (VG-S) | ProVision: Programmatically Scaling Vision-centric Instruction … | 2024-12-09 |
| Visual Question Answering | LLaVA-1.5-7B (DC-S) | ProVision: Programmatically Scaling Vision-centric Instruction … | 2024-12-09 |
| Visual Question Answering | TACO (LLaMA3-8B / CLIP) | TACO: Learning Multi-modal Action Models … | 2024-12-07 |
| Visual Question Answering | TACO (Qwen2-7B / SigLIP) | TACO: Learning Multi-modal Action Models … | 2024-12-07 |
| Visual Question Answering | TACO (LLaMA3-8B / SigLIP) | TACO: Learning Multi-modal Action Models … | 2024-12-07 |
| Visual Question Answering | InternVL2.5-38B | Expanding Performance Boundaries of Open-Source … | 2024-12-06 |
| Visual Question Answering | InternVL2.5-2B | Expanding Performance Boundaries of Open-Source … | 2024-12-06 |
| Visual Question Answering | InternVL2.5-4B | Expanding Performance Boundaries of Open-Source … | 2024-12-06 |
| Visual Question Answering | InternVL2.5-1B | Expanding Performance Boundaries of Open-Source … | 2024-12-06 |
| Visual Question Answering | InternVL2.5-8B | Expanding Performance Boundaries of Open-Source … | 2024-12-06 |
| Visual Question Answering | MAmmoTH-VL-8B | MAmmoTH-VL: Eliciting Multimodal Reasoning with … | 2024-12-06 |
| Visual Question Answering | InternVL2.5-26B | Expanding Performance Boundaries of Open-Source … | 2024-12-06 |

Research Papers

Recent papers with results on this dataset: