---
license: mit
tags:
  - robotics
  - reinforcement-learning
  - imitation-learning
  - diffusion-policy
  - flow-matching
  - robomimic
  - mujoco
language:
  - en
library_name: pytorch
pipeline_tag: reinforcement-learning
---

# DMPO Pretrained Checkpoints

Pretrained checkpoints for **DMPO: Dispersive MeanFlow Policy Optimization**.

[![Paper](https://img.shields.io/badge/arXiv-2601.20701-B31B1B)](http://arxiv.org/abs/2601.20701)
[![Code](https://img.shields.io/badge/GitHub-dmpo--release-blue)](https://github.com/Guowei-Zou/dmpo-release)
[![Project Page](https://img.shields.io/badge/Project-Page-4285F4)](https://guowei-zou.github.io/dmpo-page/)

## Overview

DMPO enables **true one-step generation** for real-time robotic control by combining MeanFlow training, dispersive regularization, and RL fine-tuning. These checkpoints can be loaded directly as base policies for PPO fine-tuning.

## Checkpoint Structure

```
pretrained_checkpoints/
├── DMPO_pretrained_gym_checkpoints/
│   ├── gym_improved_meanflow/              # MeanFlow without dispersive loss
│   └── gym_improved_meanflow_dispersive/   # MeanFlow with dispersive loss (recommended)
└── DMPO_pretraining_robomimic_checkpoints/
    ├── w_0p1/                              # dispersive weight = 0.1
    ├── w_0p5/                              # dispersive weight = 0.5 (recommended)
    └── w_0p9/                              # dispersive weight = 0.9
```

## Supported Tasks

| Domain | Tasks |
|--------|-------|
| OpenAI Gym | hopper, walker2d, ant, humanoid, kitchen-* |
| Robomimic (RGB) | lift, can, square, transport |

## Usage

Use the `hf://` prefix in config files to auto-download:

```yaml
# Gym tasks
base_policy_path: hf://pretrained_checkpoints/DMPO_pretrained_gym_checkpoints/gym_improved_meanflow_dispersive/hopper-medium-v2_best.pt

# Robomimic tasks
base_policy_path: hf://pretrained_checkpoints/DMPO_pretraining_robomimic_checkpoints/w_0p5/can/can_w0p5_08_meanflow_dispersive.pt
```
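Outside of config files, the same resolution can be done manually. A minimal sketch of how a loader might handle the `hf://` prefix, using the standard `huggingface_hub.hf_hub_download` API; the card does not name the Hub repo id, so `repo_id` below is a placeholder you would replace with the actual repository:

```python
HF_PREFIX = "hf://"


def split_hf_path(path: str):
    """Return the repo-relative filename for an hf:// path, or None for local paths."""
    if path.startswith(HF_PREFIX):
        return path[len(HF_PREFIX):]
    return None


def resolve_base_policy_path(path: str, repo_id: str) -> str:
    """Download hf:// checkpoints from the Hub; pass local paths through unchanged."""
    filename = split_hf_path(path)
    if filename is None:
        return path
    # Lazy import so the pure path logic has no hard dependency.
    from huggingface_hub import hf_hub_download  # pip install huggingface_hub
    return hf_hub_download(repo_id=repo_id, filename=filename)
```

The returned local path can then be passed to `torch.load` (use `map_location="cpu"` if no GPU is available).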

## Citation

```bibtex
@misc{zou2026stepenoughdispersivemeanflow,
      title={One Step Is Enough: Dispersive MeanFlow Policy Optimization},
      author={Guowei Zou and Haitao Wang and Hejun Wu and Yukun Qian and Yuhang Wang and Weibing Li},
      year={2026},
      eprint={2601.20701},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2601.20701},
}
```

## License

MIT License