---
datasets:
- lerobot/pusht
library_name: lerobot
license: apache-2.0
model_name: diffusion
pipeline_tag: robotics
tags:
- lerobot
- robotics
- diffusion
- pusht
- imitation-learning
- benchmark
---
# 🦾 Diffusion Policy for Push-T (200k Steps)
[LeRobot](https://github.com/huggingface/lerobot) · [Dataset: lerobot/pusht](https://huggingface.co/datasets/lerobot/pusht) · [UESTC](https://www.uestc.edu.cn/) · [License: Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0)
> **Summary:** This model demonstrates the capabilities of **Diffusion Policy** on the precision-demanding **Push-T** task. It was trained using the [LeRobot](https://github.com/huggingface/lerobot) framework as part of a thesis research project benchmarking Imitation Learning algorithms.
- **🧩 Task**: Push-T (Simulated)
- **🧠 Algorithm**: [Diffusion Policy](https://huggingface.co/papers/2303.04137) (DDPM)
- **📈 Training Steps**: 200,000 (trained in two phases via resume)
- **🎓 Author**: Graduate Student, **UESTC** (University of Electronic Science and Technology of China)
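
At inference time, a DDPM-based policy generates an action trajectory by starting from Gaussian noise and iteratively denoising it with a learned noise predictor (here, the U-Net head). The following is a minimal NumPy sketch of that reverse process; the noise schedule, step count, and the toy `denoiser` are illustrative stand-ins, not this model's actual configuration.

```python
import numpy as np

def ddpm_sample(denoiser, shape, timesteps=100, seed=0):
    """Toy DDPM reverse process: start from Gaussian noise and
    iteratively denoise it into an action trajectory (illustrative only)."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, timesteps)   # linear noise schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal(shape)               # x_T ~ N(0, I)
    for t in reversed(range(timesteps)):
        eps_hat = denoiser(x, t)                 # predicted noise at step t
        # DDPM posterior mean (Ho et al., 2020)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

# Dummy denoiser that predicts the current sample as noise, contracting
# x toward zero -- a stand-in for the trained U-Net.
trajectory = ddpm_sample(lambda x, t: x, shape=(16, 2))
print(trajectory.shape)  # (16, 2): horizon x action_dim
```

In the real policy, the denoiser is additionally conditioned on the observation history encoded by the vision backbone.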
---
## 🔬 Benchmark Results (vs ACT)
Compared to the ACT baseline (which achieved **0%** success rate in our controlled experiments), this Diffusion Policy model demonstrates significantly better control precision and trajectory stability.
### 📊 Evaluation Metrics (50 Episodes)
| Metric | Value | Comparison to ACT Baseline | Status |
| :--- | :---: | :--- | :---: |
| **Success Rate** | **14.0%** | **Significant Improvement** (ACT: 0%) | 🏆 |
| **Avg Max Reward** | **0.81** | **+58% Higher Precision** (ACT: ~0.51) | 📈 |
| **Avg Sum Reward** | **130.46** | **+147% More Stable** (ACT: ~52.7) | ✅ |
> **Note:** The Push-T environment requires **>95% target coverage** for success. An average max reward of `0.81` indicates the policy consistently moves the block very close to the target position, proving strong manipulation capabilities despite the strict success threshold.
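
For clarity, the table's three metrics can be derived from per-step reward traces, where each reward is the fraction of the target area covered (in [0, 1]) and an episode counts as a success only if coverage exceeds the 0.95 threshold. A minimal sketch with hypothetical traces:

```python
def summarize_episodes(episode_rewards, success_threshold=0.95):
    """Compute success rate, avg max reward, and avg sum reward from
    per-step reward traces (reward = fraction of target area covered)."""
    n = len(episode_rewards)
    max_rewards = [max(ep) for ep in episode_rewards]
    return {
        "success_rate": sum(r > success_threshold for r in max_rewards) / n,
        "avg_max_reward": sum(max_rewards) / n,
        "avg_sum_reward": sum(sum(ep) for ep in episode_rewards) / n,
    }

# Hypothetical traces: one episode clears 95% coverage, one stalls at 0.81.
stats = summarize_episodes([[0.2, 0.7, 0.97], [0.3, 0.81, 0.80]])
print(stats["success_rate"])  # 0.5
```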
---
## ⚙️ Model Details
| Parameter | Description |
| :--- | :--- |
| **Architecture** | ResNet18 (Vision Backbone) + U-Net (Diffusion Head) |
| **Prediction Horizon** | 16 steps |
| **Observation History** | 2 steps |
| **Action Steps** | 8 steps |
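
Together, the three horizon parameters above define a receding-horizon control loop: condition on the last 2 observations, predict a 16-step action chunk, execute only the first 8 actions, then replan. A self-contained sketch of that loop, with dummy stand-ins for the trained policy and the Push-T simulator:

```python
from collections import deque
import numpy as np

def rollout(policy, env_step, first_obs, n_steps=24,
            n_obs=2, horizon=16, n_action_steps=8):
    """Receding-horizon control: keep the last `n_obs` observations,
    ask the policy for a `horizon`-step action chunk, execute only the
    first `n_action_steps`, then replan."""
    history = deque([first_obs] * n_obs, maxlen=n_obs)
    executed = []
    while len(executed) < n_steps:
        chunk = policy(np.stack(history))     # (horizon, action_dim)
        assert chunk.shape[0] == horizon
        for action in chunk[:n_action_steps]:
            obs = env_step(action)            # apply action, observe
            history.append(obs)
            executed.append(action)
            if len(executed) == n_steps:
                break
    return np.stack(executed)

# Dummy policy and environment, for illustration only.
dummy_policy = lambda obs_hist: np.zeros((16, 2))
dummy_env_step = lambda action: np.zeros(5)
actions = rollout(dummy_policy, dummy_env_step, first_obs=np.zeros(5))
print(actions.shape)  # (24, 2)
```

Executing only half of each predicted chunk trades some open-loop smoothness for faster reaction to new observations.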
- **Training Strategy**:
  - Phase 1: Initial training (100,000 steps) -> Model: `Lemon-03/DP_PushT_test`
  - Phase 2: Resume/Fine-tuning (+100,000 steps) -> Model: `Lemon-03/DP_PushT_test_Resume`
  - **Total**: 200,000 steps
---
## 🔧 Training Configuration (Reference)
For reproducibility, here are the key parameters used during the training session:
- **Batch Size**: 64
- **Optimizer**: AdamW (`lr=1e-4`)
- **Scheduler**: Cosine with warmup
- **Vision**: ResNet18 with random crop (84×84)
- **Precision**: Mixed Precision (AMP) enabled
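
The scheduler above ramps the learning rate linearly up to the base value and then decays it along a cosine curve. A pure-Python sketch of that shape, using the run's `lr=1e-4` and total step count; the warmup length here is illustrative, not the exact value from the training config:

```python
import math

def lr_at(step, base_lr=1e-4, warmup_steps=500, total_steps=200_000):
    """Linear warmup to `base_lr`, then cosine decay to zero.
    `warmup_steps` is an assumed value for illustration."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(0))        # 0.0
print(lr_at(500))      # 0.0001 (peak lr after warmup)
print(lr_at(200_000))  # 0.0 (fully decayed)
```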
#### Original Training Command (Resume Mode)
```bash
python -m lerobot.scripts.lerobot_train \
--policy.type diffusion \
--env.type pusht \
--dataset.repo_id lerobot/pusht \
--wandb.enable true \
--eval.batch_size 8 \
--job_name DP_PushT_Resume \
--policy.repo_id Lemon-03/DP_PushT_test_Resume \
--policy.pretrained_path outputs/train/2025-12-02/14-33-35_DP_PushT/checkpoints/last/pretrained_model \
--steps 100000
```
---
## 🚀 Evaluation
To reproduce the evaluation from a local training checkpoint (50 episodes, saving rollout videos):
```bash
python -m lerobot.scripts.lerobot_eval \
--policy.type diffusion \
--policy.pretrained_path outputs/train/2025-12-04/14-47-37_DP_PushT_Resume/checkpoints/last/pretrained_model \
--eval.n_episodes 50 \
--eval.batch_size 10 \
--env.type pusht \
--env.task PushT-v0
```
Alternatively, to evaluate directly against the checkpoint published on the Hub:
```bash
python -m lerobot.scripts.lerobot_eval \
--policy.type diffusion \
--policy.pretrained_path Lemon-03/DP_PushT_test_Resume \
--eval.n_episodes 50 \
--eval.batch_size 10 \
--env.type pusht \
--env.task PushT-v0
```