CALVIN

Composing Actions from Language and Vision

Dataset Information
Introduced: 2021
License: MIT License

Overview

CALVIN (Composing Actions from Language and Vision) is an open-source simulated benchmark for learning long-horizon, language-conditioned robot manipulation tasks.

Variants: CALVIN

Associated Benchmarks

This dataset is used in 2 benchmarks: Robot Manipulation and Zero-shot Generalization.

Recent Benchmark Submissions

| Task | Model | Paper | Date |
| --- | --- | --- | --- |
| Robot Manipulation | DreamVLA | DreamVLA: A Vision-Language-Action Model Dreamed … | 2025-07-06 |
| Robot Manipulation | UniVLA | UniVLA: Learning to Act Anywhere … | 2025-05-09 |
| Robot Manipulation | OpenHelix | OpenHelix: A Short Survey, Empirical … | 2025-05-06 |
| Robot Manipulation | UP-VLA | UP-VLA: A Unified Understanding and … | 2025-01-31 |
| Robot Manipulation | VPP | Video Prediction Policy: A Generalist … | 2024-12-19 |
| Robot Manipulation | RoboVLMs | Towards Generalist Robot Policies: What … | 2024-12-18 |
| Robot Manipulation | MoDE | Efficient Diffusion Transformer Policies with … | 2024-12-17 |
| Zero-shot Generalization | MoDE | Efficient Diffusion Transformer Policies with … | 2024-12-17 |
| Robot Manipulation | VidMan | VidMan: Exploiting Implicit Dynamics from … | 2024-11-14 |
| Robot Manipulation | RoboDual | Towards Synergistic, Generalized, and Efficient … | 2024-10-10 |
| Zero-shot Generalization | GR-MG | GR-MG: Leveraging Partially Annotated Data … | 2024-08-26 |
| Robot Manipulation | GR-MG | GR-MG: Leveraging Partially Annotated Data … | 2024-08-26 |
| Zero-shot Generalization | RoboUniView | RoboUniView: Visual-Language Model with Unified … | 2024-06-27 |
| Robot Manipulation | RoboUniView | RoboUniView: Visual-Language Model with Unified … | 2024-06-27 |
| Robot Manipulation | OpenVLA | OpenVLA: An Open-Source Vision-Language-Action Model | 2024-06-13 |
| Robot Manipulation | LCB | From LLMs to Actions: Latent … | 2024-05-08 |
| Robot Manipulation | 3D Diffuser Actor | 3D Diffuser Actor: Policy Diffusion … | 2024-02-18 |
| Zero-shot Generalization | 3D Diffuser Actor | 3D Diffuser Actor: Policy Diffusion … | 2024-02-18 |
| Robot Manipulation | 3DDA | 3D Diffuser Actor: Policy Diffusion … | 2024-02-18 |
| Zero-shot Generalization | GR-1 | Unleashing Large-Scale Video Generative Pre-training … | 2023-12-20 |

Research Papers

Recent papers with results on this dataset: