Touchdown Dataset

Vision and Language Navigation Benchmark

Performance Over Time

Showing 12 results | Metric: Task Completion (TC)

Top Performing Models

| Rank | Model | Paper | Task Completion (TC) | Date | Code |
|------|-------|-------|----------------------|------|------|
| 1 | FLAME | FLAME: Learning to Navigate with Multimodal LLM in Urban Environments | 40.20 | 2024-08-20 | xyz9911/FLAME |
| 2 | ORAR + junction type + heading delta | Analyzing Generalization of Vision and Language Navigation to Unseen Outdoor Areas | 29.10 | 2022-03-25 | raphael-sch/map2seq_vln |
| 3 | ORAR | Analyzing Generalization of Vision and Language Navigation to Unseen Outdoor Areas | 24.20 | 2022-03-25 | raphael-sch/map2seq_vln |
| 4 | ARC + L2STOP | Learning to Stop: A Simple yet Effective Approach to Urban Vision-Language Navigation | 16.68 | 2020-09-28 | - |
| 5 | VLN Transformer + M-50 + style | Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation | 16.20 | 2020-07-01 | VegB/VLN-Transformer |
| 6 | VLN Transformer | Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation | 14.90 | 2020-07-01 | VegB/VLN-Transformer |
| 7 | ARC | Learning to Stop: A Simple yet Effective Approach to Urban Vision-Language Navigation | 14.13 | 2020-09-28 | - |
| 8 | Retouch-RConcat | Retouchdown: Adding Touchdown to StreetLearn as a Shareable Resource for Language Grounding Tasks in Street View | 12.80 | clic-lab/touchdown, lil-lab/touchdown, google-research/valan, VegB/VLN-Transformer | 2020-01-10 |
| 9 | Gated Attention (GA) | Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation | 11.90 | 2020-07-01 | VegB/VLN-Transformer |
| 10 | RConcat | Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation | 11.80 | 2020-07-01 | VegB/VLN-Transformer |

All Papers (12)