Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

Cheng Chi, Zhenjia Xu, Siyuan Feng, Eric Cousineau, Yilun Du, Benjamin Burchfiel, Russ Tedrake, Shuran Song (2023)

Paper Information

  • arXiv ID: 2303.04137
  • Venue: Robotics: Science and Systems
  • Domain: robotics, machine learning
  • SOTA Claim: Yes
  • Code: Available
  • Reproducibility: 8/10

Abstract

This paper introduces Diffusion Policy, a new way of generating robot behavior by representing a robot's visuomotor policy as a conditional denoising diffusion process. We benchmark Diffusion Policy across 15 different tasks from 4 different robot manipulation benchmarks and find that it consistently outperforms existing state-of-the-art robot learning methods with an average improvement of 46.9%. Diffusion Policy learns the gradient of the action-distribution score function and iteratively optimizes with respect to this gradient field during inference via a series of stochastic Langevin dynamics steps. We find that the diffusion formulation yields powerful advantages when used for robot policies, including gracefully handling multimodal action distributions, being suitable for high-dimensional action spaces, and exhibiting impressive training stability. To fully unlock the potential of diffusion models for visuomotor policy learning on physical robots, this paper presents a set of key technical contributions including the incorporation of receding horizon control, visual conditioning, and the time-series diffusion transformer. We hope this work will help motivate a new generation of policy learning techniques that are able to leverage the powerful generative modeling capabilities of diffusion models. Code, data, and training details are available at diffusion-policy.cs.columbia.edu.
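
The conditional denoising process described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' released implementation: `noise_pred_net`, the noise schedule, and the tensor shapes are hypothetical placeholders, and the loop follows a generic DDPM-style sampler that denoises a random action sequence conditioned on observation features.

```python
import torch

def sample_action_sequence(noise_pred_net, obs_features, horizon=16, action_dim=7,
                           num_steps=100, betas=None):
    """DDPM-style sketch: iteratively denoise a random action sequence
    conditioned on observation features (hypothetical network and shapes)."""
    if betas is None:
        betas = torch.linspace(1e-4, 0.02, num_steps)   # assumed noise schedule
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    # Start from pure Gaussian noise over the full action horizon.
    actions = torch.randn(1, horizon, action_dim)

    for k in reversed(range(num_steps)):
        # Predict the noise component given the current noisy actions,
        # the denoising step index, and the observation conditioning.
        eps = noise_pred_net(actions, torch.tensor([k]), obs_features)

        # Standard DDPM posterior-mean update.
        coef = betas[k] / torch.sqrt(1.0 - alpha_bars[k])
        mean = (actions - coef * eps) / torch.sqrt(alphas[k])

        # Add noise on all but the final step (stochastic Langevin-like updates).
        noise = torch.randn_like(actions) if k > 0 else torch.zeros_like(actions)
        actions = mean + torch.sqrt(betas[k]) * noise

    return actions  # (1, horizon, action_dim) denoised action sequence
```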

Summary

This paper presents Diffusion Policy, a novel framework for generating robot behavior through a conditional denoising diffusion process. It benchmarks the model across 15 tasks from 4 robot manipulation benchmarks, demonstrating an average improvement of 46.9% over existing state-of-the-art methods. Diffusion Policy learns the gradient of the action-distribution score function and iteratively refines sampled actions at inference time via stochastic Langevin dynamics steps, allowing it to handle multimodal action distributions effectively, operate in high-dimensional action spaces, and maintain training stability. Key contributions include integrating receding horizon control, visual conditioning, and a new time-series diffusion transformer architecture for improved action prediction. The paper emphasizes the effectiveness of diffusion models for visuomotor policy learning and provides extensive experimental evaluation, including both simulated and real-world tasks, revealing promising results in complex manipulation scenarios.
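
The receding-horizon behavior mentioned above can be sketched as follows: the policy predicts a full action sequence but executes only a short prefix before re-observing and re-planning. Everything here is illustrative, not the paper's code; `env`, `encode_obs`, the horizon lengths, and the reuse of `sample_action_sequence` from the sketch above are assumptions.

```python
import torch

# Receding-horizon control sketch (hypothetical env and helpers).
# Predict a multi-step action plan, execute only the first few actions,
# then re-observe and re-plan, keeping the policy reactive while still
# committing to temporally consistent action sequences.

PRED_HORIZON = 16   # length of predicted action sequence (illustrative)
EXEC_HORIZON = 8    # actions actually executed before re-planning (illustrative)

def run_episode(env, noise_pred_net, encode_obs, max_steps=300):
    obs = env.reset()
    for _ in range(max_steps // EXEC_HORIZON):
        with torch.no_grad():
            obs_features = encode_obs(obs)                  # visual conditioning
            plan = sample_action_sequence(noise_pred_net, obs_features,
                                          horizon=PRED_HORIZON)[0]
        for action in plan[:EXEC_HORIZON]:                  # execute the prefix only
            obs, reward, done, info = env.step(action.numpy())
            if done:
                return
```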

Methods

This paper employs the following methods; a minimal training-objective sketch for the action-diffusion formulation follows the list:

  • Action Diffusion
  • Stochastic Langevin Dynamics
  • Receding Horizon Control
  • Time-Series Diffusion Transformer
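
As a rough illustration of the action-diffusion training objective (a standard DDPM noise-prediction loss applied to demonstrated action sequences, conditioned on observations), the sketch below corrupts ground-truth actions at a random diffusion step and regresses the added noise. The network `noise_pred_net`, the schedule, and the tensor shapes are assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def diffusion_policy_loss(noise_pred_net, actions, obs_features,
                          num_steps=100, betas=None):
    """Sketch of one training step: corrupt a demonstrated action sequence
    at a random diffusion step and regress the added noise (MSE)."""
    if betas is None:
        betas = torch.linspace(1e-4, 0.02, num_steps)       # assumed schedule
    alpha_bars = torch.cumprod(1.0 - betas, dim=0)

    batch = actions.shape[0]
    k = torch.randint(0, num_steps, (batch,))                # random step per sample
    noise = torch.randn_like(actions)

    # Forward (noising) process: x_k = sqrt(a_bar)*x_0 + sqrt(1 - a_bar)*eps
    a_bar = alpha_bars[k].view(batch, 1, 1)
    noisy_actions = torch.sqrt(a_bar) * actions + torch.sqrt(1.0 - a_bar) * noise

    # The network predicts the noise given noisy actions, step, and conditioning.
    eps_pred = noise_pred_net(noisy_actions, k, obs_features)
    return F.mse_loss(eps_pred, noise)
```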

Models Used

  • None specified

Datasets

The following datasets were used in this research:

  • Robomimic

Evaluation Metrics

  • Success Rate

Results

  • Outperforms existing methods with an average improvement of 46.9%
  • Demonstrated effectiveness across 15 tasks from 4 benchmarks

Limitations

The authors identified the following limitations:

  • Not specified

Technical Requirements

  • Number of GPUs: 1
  • GPU Type: NVIDIA RTX 3080

Keywords

diffusion policy, visuomotor policy, robot manipulation, behavior cloning, generative models
