---
datasets:
- lerobot/pusht
library_name: lerobot
license: apache-2.0
model_name: diffusion
pipeline_tag: robotics
tags:
- lerobot
- robotics
- diffusion
- pusht
- imitation-learning
- benchmark
---
# 🦾 Diffusion Policy for Push-T (200k Steps)
[LeRobot](https://github.com/huggingface/lerobot) · [Dataset: lerobot/pusht](https://huggingface.co/datasets/lerobot/pusht) · [UESTC](https://www.uestc.edu.cn/) · [License: Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0)
> **Summary:** This model demonstrates the capabilities of **Diffusion Policy** on the precision-demanding **Push-T** task. It was trained using the [LeRobot](https://github.com/huggingface/lerobot) framework as part of a thesis research project benchmarking Imitation Learning algorithms.
- **🧩 Task**: Push-T (Simulated)
- **🧠 Algorithm**: [Diffusion Policy](https://huggingface.co/papers/2303.04137) (DDPM)
- **📈 Training Steps**: 200,000 (trained in two phases via resume)
- **🎓 Author**: Graduate Student, **UESTC** (University of Electronic Science and Technology of China)
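
At inference time, a DDPM-based policy generates an action trajectory by starting from Gaussian noise and iteratively denoising it with a learned noise predictor (here, the U-Net head). The following is a minimal NumPy sketch of that reverse process; the noise schedule, step count, and the toy `denoiser` are illustrative stand-ins, not this model's actual configuration.

```python
import numpy as np

def ddpm_sample(denoiser, shape, timesteps=100, seed=0):
    """Toy DDPM reverse process: start from Gaussian noise and
    iteratively denoise it into an action trajectory (illustrative only)."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, timesteps)   # linear noise schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal(shape)               # x_T ~ N(0, I)
    for t in reversed(range(timesteps)):
        eps_hat = denoiser(x, t)                 # predicted noise at step t
        # DDPM posterior mean (Ho et al., 2020)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

# Dummy denoiser that predicts the current sample as noise, contracting
# x toward zero -- a stand-in for the trained U-Net.
trajectory = ddpm_sample(lambda x, t: x, shape=(16, 2))
print(trajectory.shape)  # (16, 2): horizon x action_dim
```

In the real policy, the denoiser is additionally conditioned on the observation history encoded by the vision backbone.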
---
## 🔬 Benchmark Results (vs ACT)
Compared to the ACT baseline (which achieved **0%** success rate in our controlled experiments), this Diffusion Policy model demonstrates significantly better control precision and trajectory stability.
### 📊 Evaluation Metrics (50 Episodes)
| Metric | Value | Comparison to ACT Baseline | Status |
| :--- | :---: | :--- | :---: |
| **Success Rate** | **14.0%** | **Significant Improvement** (ACT: 0%) | 🏆 |
| **Avg Max Reward** | **0.81** | **+58% Higher Precision** (ACT: ~0.51) | 📈 |
| **Avg Sum Reward** | **130.46** | **+147% More Stable** (ACT: ~52.7) | ✅ |
> **Note:** The Push-T environment requires **>95% target coverage** for success. An average max reward of `0.81` indicates the policy consistently moves the block very close to the target position, proving strong manipulation capabilities despite the strict success threshold.
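
For clarity, the table's three metrics can be derived from per-step reward traces, where each reward is the fraction of the target area covered (in [0, 1]) and an episode counts as a success only if coverage exceeds the 0.95 threshold. A minimal sketch with hypothetical traces:

```python
def summarize_episodes(episode_rewards, success_threshold=0.95):
    """Compute success rate, avg max reward, and avg sum reward from
    per-step reward traces (reward = fraction of target area covered)."""
    n = len(episode_rewards)
    max_rewards = [max(ep) for ep in episode_rewards]
    return {
        "success_rate": sum(r > success_threshold for r in max_rewards) / n,
        "avg_max_reward": sum(max_rewards) / n,
        "avg_sum_reward": sum(sum(ep) for ep in episode_rewards) / n,
    }

# Hypothetical traces: one episode clears 95% coverage, one stalls at 0.81.
stats = summarize_episodes([[0.2, 0.7, 0.97], [0.3, 0.81, 0.80]])
print(stats["success_rate"])  # 0.5
```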
---
## ⚙️ Model Details
| Parameter | Description |
| :--- | :--- |
| **Architecture** | ResNet18 (Vision Backbone) + U-Net (Diffusion Head) |
| **Prediction Horizon** | 16 steps |
| **Observation History** | 2 steps |
| **Action Steps** | 8 steps |
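
Together, the three horizon parameters above define a receding-horizon control loop: condition on the last 2 observations, predict a 16-step action chunk, execute only the first 8 actions, then replan. A self-contained sketch of that loop, with dummy stand-ins for the trained policy and the Push-T simulator:

```python
from collections import deque
import numpy as np

def rollout(policy, env_step, first_obs, n_steps=24,
            n_obs=2, horizon=16, n_action_steps=8):
    """Receding-horizon control: keep the last `n_obs` observations,
    ask the policy for a `horizon`-step action chunk, execute only the
    first `n_action_steps`, then replan."""
    history = deque([first_obs] * n_obs, maxlen=n_obs)
    executed = []
    while len(executed) < n_steps:
        chunk = policy(np.stack(history))     # (horizon, action_dim)
        assert chunk.shape[0] == horizon
        for action in chunk[:n_action_steps]:
            obs = env_step(action)            # apply action, observe
            history.append(obs)
            executed.append(action)
            if len(executed) == n_steps:
                break
    return np.stack(executed)

# Dummy policy and environment, for illustration only.
dummy_policy = lambda obs_hist: np.zeros((16, 2))
dummy_env_step = lambda action: np.zeros(5)
actions = rollout(dummy_policy, dummy_env_step, first_obs=np.zeros(5))
print(actions.shape)  # (24, 2)
```

Executing only half of each predicted chunk trades some open-loop smoothness for faster reaction to new observations.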
- **Training Strategy**:
  - Phase 1: Initial training (100,000 steps) -> Model: `Lemon-03/DP_PushT_test`
  - Phase 2: Resume/Fine-tuning (+100,000 steps) -> Model: `Lemon-03/DP_PushT_test_Resume`
  - **Total**: 200,000 steps
---
## 🔧 Training Configuration (Reference)
For reproducibility, here are the key parameters used during the training session:
- **Batch Size**: 64
- **Optimizer**: AdamW (`lr=1e-4`)
- **Scheduler**: Cosine with warmup
- **Vision**: ResNet18 with random crop (84×84)
- **Precision**: Mixed Precision (AMP) enabled
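
The scheduler above ramps the learning rate linearly up to the base value and then decays it along a cosine curve. A pure-Python sketch of that shape, using the run's `lr=1e-4` and total step count; the warmup length here is illustrative, not the exact value from the training config:

```python
import math

def lr_at(step, base_lr=1e-4, warmup_steps=500, total_steps=200_000):
    """Linear warmup to `base_lr`, then cosine decay to zero.
    `warmup_steps` is an assumed value for illustration."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(0))        # 0.0
print(lr_at(500))      # 0.0001 (peak lr after warmup)
print(lr_at(200_000))  # 0.0 (fully decayed)
```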
#### Original Training Command (Resume Mode)
```bash
python -m lerobot.scripts.lerobot_train \
--policy.type diffusion \
--env.type pusht \
--dataset.repo_id lerobot/pusht \
--wandb.enable true \
--eval.batch_size 8 \
--job_name DP_PushT_Resume \
--policy.repo_id Lemon-03/DP_PushT_test_Resume \
--policy.pretrained_path outputs/train/2025-12-02/14-33-35_DP_PushT/checkpoints/last/pretrained_model \
--steps 100000
```
---
## 🚀 Evaluation
To reproduce the evaluation from a local training checkpoint (50 episodes, saving rollout videos):
```bash
python -m lerobot.scripts.lerobot_eval \
--policy.type diffusion \
--policy.pretrained_path outputs/train/2025-12-04/14-47-37_DP_PushT_Resume/checkpoints/last/pretrained_model \
--eval.n_episodes 50 \
--eval.batch_size 10 \
--env.type pusht \
--env.task PushT-v0
```
Alternatively, to evaluate directly against the checkpoint published on the Hub:
```bash
python -m lerobot.scripts.lerobot_eval \
--policy.type diffusion \
--policy.pretrained_path Lemon-03/DP_PushT_test_Resume \
--eval.n_episodes 50 \
--eval.batch_size 10 \
--env.type pusht \
--env.task PushT-v0
```