
OVBench

Video Question Answering Benchmark

Performance Over Time

[Chart: model performance over time — 15 results, metric: AVG]

Top Performing Models

| Rank | Model | Paper | AVG | Date | Code |
|------|-------|-------|-----|------|------|
| 1 | Seed1.5-VL | Seed1.5-VL Technical Report | 60.00 | 2025-05-11 | – |
| 2 | VideoChat-Online (4B) | Online Video Understanding: OVBench and VideoChat-Online | 54.90 | 2024-12-31 | MCG-NJU/VideoChat-Online |
| 3 | Gemini-1.5-Flash | Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context | 50.70 | 2024-03-08 | dlvuldet/primevul |
| 4 | Qwen2-VL (7B) | Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution | 49.70 | 2024-09-18 | qwenlm/qwen2-vl, qwenlm/qwen2.5-vl, juruobenruo/DexVLA |
| 5 | LLaVA-OneVision (7B) | LLaVA-OneVision: Easy Visual Task Transfer | 49.50 | 2024-08-06 | evolvinglmms-lab/lmms-eval, MindSpore-scientific-2/code-14 |
| 6 | InternVL2 (7B) | Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling | 48.70 | 2024-12-06 | opengvlab/internvl |
| 7 | InternVL2 (4B) | Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling | 44.10 | 2024-12-06 | opengvlab/internvl |
| 8 | LongVA (7B) | Long Context Transfer from Language to Vision | 43.60 | 2024-06-24 | jzhang38/EasyContext, evolvinglmms-lab/longva |
| 9 | LLaMA-VID (7B) | LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models | 41.90 | 2023-11-28 | lastmile-ai/aiconfig, dvlab-research/llama-vid |
| 10 | VTimeLLM (7B) | VTimeLLM: Empower LLM to Grasp Video Moments | 33.10 | 2023-11-30 | huangb23/vtimellm |

All Papers (15)