Machine Learning Benchmarks

Browse 15 benchmarks across 9 tasks
Browse by Category

10-shot image generation

FQL-Driving

📊 1 result
📏 Metrics: 0-shot MRR

FlyingThings3D

FlyingThings3D is a synthetic dataset for optical flow, disparity and scene flow estimation. It consists of everyday objects flying along …

📊 1 result
📏 Metrics: 0..5sec

MEAD

Multi-view Emotional Audio-visual Dataset

📊 1 result
📏 Metrics: 12k

Music21

Music21 is an untrimmed video dataset crawled by keyword query from YouTube. It contains music performances belonging to 21 categories. …

📊 1 result
📏 Metrics: 0..5sec

16k

ConceptNet

ConceptNet is a knowledge graph that connects words and phrases of natural language with labeled edges. Its knowledge is collected …

📊 1 result
📏 Metrics: 1'"

3D Face Animation

BEAT2

We propose EMAGE, a framework to generate full-body human gestures from audio and masked gestures, encompassing facial, local body, hands, …

📊 5 results
📏 Metrics: MSE

Biwi 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2

BIWI 3D corpus comprises a total of 1109 sentences uttered by 14 native English speakers (6 males and 8 females). …

📊 5 results
📏 Metrics: Lip Vertex Error, FDD

VOCASET

VOCASET is a 4D face dataset with about 29 minutes of 4D scans captured at 60 fps and synchronized audio. …

📊 2 results
📏 Metrics: Lip Vertex Error

3D Face Modelling

Voxceleb-3D

A dataset for voice and 3D face structure study. It contains about 1.4K identities with their 3D face models and …

📊 2 results
📏 Metrics: Mean ARE, ARE-ER, ARE-FR, ARE-MR, ARE-CR

Multi-agent Reinforcement Learning

SMAC-Exp

The StarCraft Multi-Agent Challenges+ requires agents to learn completion of multi-stage tasks and usage of environmental factors without precise reward …

📊 1 result
📏 Metrics: Median Win Rate

Offline RL

D4RL

D4RL is a collection of environments for offline reinforcement learning. These environments include Maze2D, AntMaze, Adroit, Gym, Flow, FrankaKitchen and …

📊 3 results
📏 Metrics: Average Reward

Playing the Game of 2048

The Game of 2048

The 2048 game task involves training an agent to achieve high scores in the game 2048 (Wikipedia)

📊 2 results
📏 Metrics: Average Score

Reinforcement Learning (Atari Games)

Seaquest - OpenAI Gym

Dataset: The experiments are conducted using the Seaquest environment from the OpenAI Gym framework, which simulates the Atari 2600 game …

📊 1 result
📏 Metrics: Average Return

Trajectory Planning

ToolBench

ToolBench is an instruction-tuning dataset for tool use, which is created automatically using ChatGPT. Specifically, the authors collect 16,464 real-world …

📊 3 results
📏 Metrics: Win rate

nuScenes

The nuScenes dataset is a large-scale autonomous driving dataset. The dataset has 3D bounding boxes for 1000 scenes collected in …

📊 4 results
📏 Metrics: Collision-3s, L2-3s, Collision-1s, Collision-2s, Collision-Avg, L2-1s, L2-2s, L2-Avg