Update README.md
README.md
CHANGED
@@ -12,7 +12,6 @@ tags:
 - aloha
 - imitation-learning
 - benchmark
-
 ---
 
 # 🦾 Diffusion Policy for Aloha Insertion (200k Steps)
@@ -22,6 +21,10 @@ tags:
 [](https://www.uestc.edu.cn/)
 [](https://www.apache.org/licenses/LICENSE-2.0)
 
+## 🎯 Research Purpose
+
+**Important Note:** This model was trained primarily for **academic comparison**: it measures the performance gap between the **Diffusion Policy** and **ACT** algorithms under identical training conditions (both use the `lerobot/aloha_sim_insertion_human` dataset). It is a benchmark experiment that analyzes how well each algorithm learns a complex 3D manipulation task under limited computational resources (batch size 8); it is **not an attempt to train a highly successful practical model**.
+
 > **Summary:** This model represents a benchmark experiment for **Diffusion Policy** on the challenging **Aloha Insertion** task (Simulated). It was trained using the [LeRobot](https://github.com/huggingface/lerobot) framework to evaluate the algorithm's performance on complex, high-dimensional 3D manipulation tasks compared to baseline methods.
 
 - **🧩 Task**: Aloha Insertion (Simulated, 3D)
@@ -89,58 +92,58 @@ python -m lerobot.scripts.lerobot_train \
 ```yaml
 # @package _global_
 
-#
+# Random seed
 seed: 100000
 job_name: Diffusion-Aloha-Insertion
 
-#
-steps: 200000 #
-eval_freq: 20000 #
+# Training parameters
+steps: 200000 # The original config specifies 200k steps (Aloha is difficult to train)
+eval_freq: 20000 # Evaluate slightly more often to monitor progress
 save_freq: 20000
 log_freq: 200
-batch_size: 8 # ⚠️
+batch_size: 8 # ⚠️ Crucial: Aloha needs a small batch size; otherwise 8 GB of VRAM is insufficient
 
-#
+# Dataset
 dataset:
   repo_id: lerobot/aloha_sim_insertion_human
 
-#
+# Evaluation settings
 eval:
   n_episodes: 50
-  batch_size: 8 #
+  batch_size: 8 # Keep consistent with training
 
-#
+# Environment settings
 env:
   type: aloha
   task: AlohaInsertion-v0
   fps: 50
 
-#
+# Policy configuration
 policy:
   type: diffusion
 
-  # ---
+  # --- Vision processing ---
   vision_backbone: resnet18
-  # Aloha
+  # Aloha images are rectangular, so a rectangular crop is used here
   crop_shape: [420, 560]
   crop_is_random: true
-  pretrained_backbone_weights: null #
+  pretrained_backbone_weights: null # As in the original config, do not load pretrained weights
   use_group_norm: true
   spatial_softmax_num_keypoints: 32
 
-  # --- Diffusion
+  # --- Diffusion core architecture (U-Net) ---
   down_dims: [512, 1024, 2048]
   kernel_size: 5
   n_groups: 8
   diffusion_step_embed_dim: 128
   use_film_scale_modulation: true
 
-  # ---
+  # --- Action prediction parameters ---
   n_action_steps: 8
   n_obs_steps: 2
   horizon: 16
 
-  # ---
+  # --- Noise scheduler (DDPM) ---
   noise_scheduler_type: DDPM
   num_train_timesteps: 100
   num_inference_timesteps: 100
@@ -151,7 +154,7 @@ policy:
   clip_sample: true
   clip_sample_range: 1.0
 
-  # ---
+  # --- Optimizer ---
   optimizer_lr: 1e-4
   optimizer_weight_decay: 1e-6
   #grad_clip_norm: 10
@@ -189,4 +192,4 @@ python -m lerobot.scripts.lerobot_eval \
 --eval.batch_size 8 \
 --env.type aloha \
 --env.task AlohaInsertion-v0
-```
+```
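
Reviewer note: the action-prediction values in the config (`n_obs_steps: 2`, `horizon: 16`, `n_action_steps: 8`), together with `fps: 50` and `num_inference_timesteps: 100`, imply a particular receding-horizon schedule. The sketch below only works through that arithmetic. It is not LeRobot code: `ChunkingConfig`, `executed_slice`, and `summarize` are hypothetical names, and the indexing rule (the first executed action sits at index `n_obs_steps - 1` of the predicted sequence) is an assumption about how the policy slices its output, not something this diff confirms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkingConfig:
    """Hypothetical container; the default values are copied from the YAML in this diff."""

    n_obs_steps: int = 2
    horizon: int = 16
    n_action_steps: int = 8
    num_inference_timesteps: int = 100
    fps: int = 50


def executed_slice(cfg: ChunkingConfig) -> tuple[int, int]:
    """Return the [start, end) indices of the predicted actions that get executed.

    Assumption: the predicted sequence begins at the oldest observation, so the
    action for the current step is at index n_obs_steps - 1.
    """
    start = cfg.n_obs_steps - 1
    end = start + cfg.n_action_steps
    if end > cfg.horizon:
        raise ValueError(
            f"n_action_steps={cfg.n_action_steps} does not fit in horizon={cfg.horizon} "
            f"when starting at index {start}"
        )
    return start, end


def summarize(cfg: ChunkingConfig) -> dict:
    """Derive per-chunk timing and denoising cost from the config values."""
    start, end = executed_slice(cfg)
    chunks_per_second = cfg.fps / cfg.n_action_steps
    return {
        "executed_indices": (start, end),
        # Predicted actions after the executed slice are discarded.
        "unused_tail_actions": cfg.horizon - end,
        # Simulated time covered by one executed chunk.
        "chunk_duration_s": cfg.n_action_steps / cfg.fps,
        # One denoising pass per inference timestep, once per chunk.
        "denoising_passes_per_sim_second": chunks_per_second * cfg.num_inference_timesteps,
    }


if __name__ == "__main__":
    for key, value in summarize(ChunkingConfig()).items():
        print(f"{key}: {value}")
```

Under these assumptions, each policy call executes predicted indices 1 through 8, discards the last 7 predicted actions, covers 0.16 s of simulated time, and costs 625 denoising passes per simulated second. That cost is worth keeping in mind when comparing evaluation wall-clock time against ACT.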