---
license: mit
tags:
- robotics
- reinforcement-learning
- imitation-learning
- diffusion-policy
- flow-matching
- robomimic
- mujoco
language:
- en
library_name: pytorch
pipeline_tag: reinforcement-learning
---

# DMPO Pretrained Checkpoints

Pretrained checkpoints for **DMPO: Dispersive MeanFlow Policy Optimization**.

[Paper](http://arxiv.org/abs/2601.20701)
[Code](https://github.com/Guowei-Zou/dmpo-release)
[Project Page](https://guowei-zou.github.io/dmpo-page/)

## Overview

DMPO enables **true one-step generation** for real-time robotic control by combining MeanFlow, dispersive regularization, and RL fine-tuning. The checkpoints in this repository are pretrained policies that can be used directly as the starting point for PPO fine-tuning.
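
To make "one-step generation" concrete, here is a minimal sketch of one-step sampling. It assumes the common MeanFlow convention, in which a network predicts the average velocity `u(z, r, t)` between times `r` and `t`, with noise at `t = 1` and data at `t = 0`. The function names and the toy velocity below are hypothetical illustrations, not the DMPO implementation.

```python
import random


def one_step_action(mean_velocity, obs, noise):
    """Map a noise sample to an action with a single function evaluation.

    Assumption: mean_velocity(z, obs, r, t) returns the average velocity
    between times r and t, so z_0 = z_1 - (1 - 0) * u(z_1, r=0, t=1).
    """
    u = mean_velocity(noise, obs, 0.0, 1.0)
    return [z - v for z, v in zip(noise, u)]


def toy_mean_velocity(z, obs, r, t):
    # Hypothetical stand-in for a trained network: on a straight
    # noise-to-data path whose target action is 2 * obs, the average
    # velocity over [0, 1] evaluated at z_1 is z_1 - 2 * obs.
    return [zi - 2.0 * oi for zi, oi in zip(z, obs)]


obs = [0.1, -0.3, 0.5]
noise = [random.gauss(0.0, 1.0) for _ in obs]
action = one_step_action(toy_mean_velocity, obs, noise)
```

With the toy velocity, `action` recovers `2 * obs` (up to floating-point error) regardless of the noise draw, using one evaluation instead of an iterative denoising loop.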

## Checkpoint Structure

```
pretrained_checkpoints/
├── DMPO_pretrained_gym_checkpoints/
│   ├── gym_improved_meanflow/              # MeanFlow without dispersive loss
│   └── gym_improved_meanflow_dispersive/   # MeanFlow with dispersive loss (recommended)
└── DMPO_pretraining_robomimic_checkpoints/
    ├── w_0p1/                              # dispersive weight = 0.1
    ├── w_0p5/                              # dispersive weight = 0.5 (recommended)
    └── w_0p9/                              # dispersive weight = 0.9
```
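
The Robomimic directory names encode the dispersive weight by replacing the decimal point with `p`. The sketch below reproduces that naming; the helper names are hypothetical, and the per-task subdirectory is inferred from the Robomimic path shown under Usage rather than from the tree above.

```python
from pathlib import PurePosixPath

ROBOMIMIC_ROOT = PurePosixPath(
    "pretrained_checkpoints/DMPO_pretraining_robomimic_checkpoints"
)


def weight_dir(weight: float) -> str:
    """Encode a dispersive weight as a directory name, e.g. 0.5 -> 'w_0p5'."""
    return "w_" + f"{weight:g}".replace(".", "p")


def robomimic_task_dir(weight: float, task: str) -> PurePosixPath:
    """Build the directory expected to hold checkpoints for one task."""
    return ROBOMIMIC_ROOT / weight_dir(weight) / task


recommended = robomimic_task_dir(0.5, "can")
```

Checkpoint file names inside each task directory are not derived here; take them from the repository listing.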

## Supported Tasks

| Domain | Tasks |
|--------|-------|
| OpenAI Gym | hopper, walker2d, ant, humanoid, kitchen-* |
| Robomimic (RGB) | lift, can, square, transport |

## Usage

Prefix a checkpoint path with `hf://` in a config file to download it automatically. Set `base_policy_path` to one of the following, depending on the task:

```yaml
# Gym tasks
base_policy_path: hf://pretrained_checkpoints/DMPO_pretrained_gym_checkpoints/gym_improved_meanflow_dispersive/hopper-medium-v2_best.pt

# Robomimic tasks
base_policy_path: hf://pretrained_checkpoints/DMPO_pretraining_robomimic_checkpoints/w_0p5/can/can_w0p5_08_meanflow_dispersive.pt
```
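
For readers who want to fetch a checkpoint outside the training configs, the following is a rough sketch of how an `hf://` path could be resolved by hand. It is not the resolver used by the DMPO code: `strip_hf_prefix`, `resolve_checkpoint`, and the `REPO_ID` placeholder are hypothetical, and the repository id must be filled in by the reader. The only library call is `hf_hub_download` from `huggingface_hub`.

```python
from typing import Optional

HF_PREFIX = "hf://"
REPO_ID = "<user>/<repo>"  # placeholder: the Hub repository id is not stated here


def strip_hf_prefix(path: str) -> Optional[str]:
    """Return the repo-relative file path, or None for a plain local path."""
    if not path.startswith(HF_PREFIX):
        return None
    return path[len(HF_PREFIX):]


def resolve_checkpoint(path: str, repo_id: str = REPO_ID) -> str:
    """Return a local file path, downloading from the Hub when needed."""
    filename = strip_hf_prefix(path)
    if filename is None:
        return path  # already a local path; nothing to download
    # Imported lazily so that local paths work without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)
```

Calling `resolve_checkpoint` with one of the `hf://` paths above and a real `repo_id` returns the path of the cached file, which can then be passed to `torch.load`.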

## Citation

```bibtex
@misc{zou2026stepenoughdispersivemeanflow,
      title={One Step Is Enough: Dispersive MeanFlow Policy Optimization},
      author={Guowei Zou and Haitao Wang and Hejun Wu and Yukun Qian and Yuhang Wang and Weibing Li},
      year={2026},
      eprint={2601.20701},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2601.20701},
}
```

## License

MIT License