Commit cc8509f (verified) by zengxy0624 · 1 parent: f045b91

Initial release: DP finetuned on PushT-with-obstacles, 95% success

Files changed (3)
  1. README.md +108 -0
  2. config.json +82 -0
  3. model.safetensors +3 -0
README.md ADDED
@@ -0,0 +1,108 @@
+ ---
+ license: apache-2.0
+ library_name: lerobot
+ pipeline_tag: robotics
+ tags:
+ - robotics
+ - diffusion-policy
+ - imitation-learning
+ - pusht
+ base_model: lerobot/diffusion_pusht
+ ---
+
+ # Diffusion Policy — PushT with Obstacles
+
+ Diffusion Policy finetuned from [`lerobot/diffusion_pusht`](https://huggingface.co/lerobot/diffusion_pusht)
+ for the PushT manipulation task **with random circular obstacles**: the agent
+ pushes a T-shaped block to a goal pose while avoiding 1–3 obstacles per episode.
+
+ The base model handles standard PushT but has zero obstacle awareness
+ (0% success, 55% obstacle-hit rate as zero-shot baseline). Finetuning on
+ 101 obstacle-aware demonstrations recovers a working policy.
+
+ ## Results
+
+ | Checkpoint | Success Rate | Obstacle-Hit Rate |
+ |---|---|---|
+ | Base (`lerobot/diffusion_pusht`, zero-shot on obstacle env) | 0% | 55% |
+ | **This model** (best of 30k finetune steps) | **95%** | **0%** |
+
+ Evaluated on `PushTObstacleEnv` with 20 episodes per checkpoint, 300 max steps,
+ success threshold 0.95 coverage.
+
+ > Note: 20 episodes is a noisy estimator (Wilson 95% CI ≈ ±20%). Treat the
+ > 95% headline as approximate; a 100-episode re-evaluation is recommended.
+
+ ## Architecture
+
+ Inherited from the base model (no architecture changes, only weight finetuning):
+
+ | Field | Value |
+ |---|---|
+ | Vision backbone | ResNet-18 |
+ | Image input | 3×96×96 (random-cropped to 84×84) |
+ | State input | 2 (agent_pos) |
+ | Action output | 2 |
+ | `n_obs_steps` | 2 |
+ | `horizon` | 16 |
+ | `n_action_steps` | 8 |
+ | Diffusion timesteps | 100 |
+ | Parameters | 262,709,026 |
+
+ ## Training
+
+ - **Hardware**: 1× NVIDIA H100 (NCSA Delta AI), AMP enabled
+ - **Wall time**: ~3 hours for 30k steps
+ - **Optimizer**: AdamW, β=(0.95, 0.999), wd=1e-6
+ - **LR**: 3e-5 (constant after 100-step warmup)
+ - **Batch size**: 64
+ - **Dataset**: 101 episodes / 15,758 frames @ 10 fps, recorded with mouse teleop
+   in `pusht_obstacle_env.py`
+ - **Normalization**: dataset stats recomputed locally (image mean/std differ
+   from the base PushT distribution due to obstacle pixels)
+
+ ## Usage
+
+ ```python
+ from lerobot.policies.diffusion.modeling_diffusion import DiffusionPolicy
+
+ policy = DiffusionPolicy.from_pretrained("zengxy0624/diffusion-pusht-obstacles")
+ policy.eval()
+ ```
+
+ Drop-in compatible with the base model — same input/output schema, just
+ swap the repo id.
+
+ ## Limitations
+
+ - Only trained on circular obstacles with radius 15 px and 1–3 per episode.
+   Out-of-distribution obstacle counts/shapes are not handled.
+ - Late-training evaluation showed high variance (occasional collapses to
+   10% success). The released checkpoint is a single best-of-N draw and may
+   not exactly reproduce 95% on a fresh 100-episode evaluation.
+ - No EMA was used during training; the base `lerobot/diffusion_pusht` model
+   was trained with EMA. Adding EMA is a known follow-up.
+
+ ## Citation
+
+ If this checkpoint is useful, please cite the original Diffusion Policy work:
+
+ ```bibtex
+ @article{chi2023diffusion,
+   title={Diffusion Policy: Visuomotor Policy Learning via Action Diffusion},
+   author={Chi, Cheng and Feng, Siyuan and Du, Yilun and Xu, Zhenjia and Cousineau, Eric and Burchfiel, Benjamin and Song, Shuran},
+   journal={The International Journal of Robotics Research},
+   year={2023},
+ }
+ ```
+
+ And LeRobot:
+
+ ```bibtex
+ @misc{cadene2024lerobot,
+   title={LeRobot: State-of-the-art Machine Learning for Real-World Robotics in Pytorch},
+   author={Cadene, Remi and Alibert, Simon and Soare, Alexander and Gallouedec, Quentin and Zouitine, Adil and Wolf, Thomas},
+   year={2024},
+   url={https://github.com/huggingface/lerobot},
+ }
+ ```
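Reviewer sketch (not part of the commit): the README's note about evaluation noise can be checked with a Wilson score interval. The helper name `wilson_interval` is hypothetical, `z = 1.96` is the usual approximation for a 95% interval, and reading "95% over 20 episodes" as 19 successes is an assumption.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1.0 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials)) / denom
    )
    return center - half_width, center + half_width


# Assumption: a 95% success rate over 20 episodes corresponds to 19 successes.
low, high = wilson_interval(19, 20)
print(f"19/20 successes: [{low:.3f}, {high:.3f}]")

# The same observed rate over the recommended 100 episodes, for comparison.
low100, high100 = wilson_interval(95, 100)
print(f"95/100 successes: [{low100:.3f}, {high100:.3f}]")
```

Under these assumptions the 20-episode interval runs from roughly 0.76 to 0.99, so the lower bound sits close to 20 points below the headline figure. The 100-episode interval is noticeably tighter, which is the motivation for the re-evaluation the README recommends.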
config.json ADDED
@@ -0,0 +1,82 @@
+ {
+     "type": "diffusion",
+     "n_obs_steps": 2,
+     "input_features": {
+         "observation.image": {
+             "type": "VISUAL",
+             "shape": [
+                 3,
+                 96,
+                 96
+             ]
+         },
+         "observation.state": {
+             "type": "STATE",
+             "shape": [
+                 2
+             ]
+         }
+     },
+     "output_features": {
+         "action": {
+             "type": "ACTION",
+             "shape": [
+                 2
+             ]
+         }
+     },
+     "device": "cpu",
+     "use_amp": false,
+     "push_to_hub": true,
+     "repo_id": null,
+     "private": null,
+     "tags": null,
+     "license": null,
+     "pretrained_path": null,
+     "horizon": 16,
+     "n_action_steps": 8,
+     "normalization_mapping": {
+         "ACTION": "MIN_MAX",
+         "STATE": "MIN_MAX",
+         "VISUAL": "MEAN_STD"
+     },
+     "drop_n_last_frames": 7,
+     "vision_backbone": "resnet18",
+     "crop_shape": [
+         84,
+         84
+     ],
+     "crop_is_random": true,
+     "pretrained_backbone_weights": null,
+     "use_group_norm": true,
+     "spatial_softmax_num_keypoints": 32,
+     "use_separate_rgb_encoder_per_camera": false,
+     "down_dims": [
+         512,
+         1024,
+         2048
+     ],
+     "kernel_size": 5,
+     "n_groups": 8,
+     "diffusion_step_embed_dim": 128,
+     "use_film_scale_modulation": true,
+     "noise_scheduler_type": "DDPM",
+     "num_train_timesteps": 100,
+     "beta_schedule": "squaredcos_cap_v2",
+     "beta_start": 0.0001,
+     "beta_end": 0.02,
+     "prediction_type": "epsilon",
+     "clip_sample": true,
+     "clip_sample_range": 1.0,
+     "num_inference_steps": null,
+     "do_mask_loss_for_padding": false,
+     "optimizer_lr": 0.0001,
+     "optimizer_betas": [
+         0.95,
+         0.999
+     ],
+     "optimizer_eps": 1e-08,
+     "optimizer_weight_decay": 1e-06,
+     "scheduler_name": "cosine",
+     "scheduler_warmup_steps": 500
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6c80dfe0bd0b823af9db3a67f46faba420d35ea5e02d2069df97922a7850f054
+ size 1050861448
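Reviewer sketch (not part of the commit): a quick consistency check between the parameter count stated in the README and the LFS pointer size above. It assumes the weights are stored as 4-byte float32 values, which the commit does not state. Any remainder would be the safetensors header plus whatever non-parameter tensors the file holds.

```python
# Values copied from the README table and the LFS pointer in this commit.
PARAM_COUNT = 262_709_026
FILE_SIZE_BYTES = 1_050_861_448

# Assumption: every parameter is stored as float32 (4 bytes).
BYTES_PER_FLOAT32 = 4

weight_bytes = PARAM_COUNT * BYTES_PER_FLOAT32
overhead_bytes = FILE_SIZE_BYTES - weight_bytes

print(f"weights at float32: {weight_bytes:,} bytes")
print(f"remaining bytes:    {overhead_bytes:,}")
print(f"remainder share:    {overhead_bytes / FILE_SIZE_BYTES:.6%}")
```

A small positive remainder is consistent with float32 storage, though it does not prove it. A negative remainder would mean the stated parameter count and the file size disagree under this assumption.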