MMBench is a multi-modality benchmark. It builds a comprehensive evaluation pipeline with two main components. The first is a meticulously curated dataset that surpasses existing benchmarks in both the number and the variety of evaluation questions and tested abilities. The second is a novel CircularEval strategy, combined with ChatGPT-based answer matching: free-form model predictions are mapped to the pre-defined choices, enabling a more robust evaluation of a model's predictions.
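The CircularEval idea can be illustrated with a minimal sketch. This is a hypothetical implementation, not MMBench's actual code: for a question with N choices, the model is queried N times with circularly shifted choice orderings, and the prediction counts as correct only if it matches the ground truth under every rotation. The `predict` callable here stands in for the (assumed) model-plus-ChatGPT matching step that returns one of the given choices.

```python
def circular_shifts(choices):
    """Yield every circular rotation of the answer choices."""
    n = len(choices)
    for k in range(n):
        yield choices[k:] + choices[:k]

def circular_eval(question, choices, correct, predict):
    """Hypothetical CircularEval sketch: the prediction is accepted
    only if `predict` returns the correct choice for every rotation
    of the choice list."""
    for shifted in circular_shifts(choices):
        pred = predict(question, shifted)  # assumed to return one choice's text
        if pred != correct:
            return False
    return True
```

For example, a model that always answers with the first listed choice fails CircularEval, since some rotation places a wrong answer first, while a model that truly identifies the correct option passes all rotations.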
Variants: MMBench
This dataset is used in 1 benchmark (Visual Question Answering). Recent papers with results on this dataset:
| Task | Model | Paper | Date |
|---|---|---|---|
| Visual Question Answering | LLaVA-InternLM2-ViT + MoSLoRA | Mixture-of-Subspaces in Low-Rank Adaptation | 2024-06-16 |
| Visual Question Answering | LLaVA-LLaMA3-8B-ViT + MoSLoRA | Mixture-of-Subspaces in Low-Rank Adaptation | 2024-06-16 |
| Visual Question Answering | CuMo-7B | CuMo: Scaling Multimodal LLM with … | 2024-05-09 |
| Visual Question Answering | Video-LaVIT | Video-LaVIT: Unified Video-Language Pre-training with … | 2024-02-05 |
| Visual Question Answering | DreamLLM-7B | DreamLLM: Synergistic Multimodal Comprehension and … | 2023-09-20 |