---
license: apache-2.0
library_name: lerobot
pipeline_tag: robotics
tags:
- act
- lerobot
- so101
- leisaac
- pick-orange
- isaac-sim
datasets:
- LightwheelAI/leisaac-pick-orange
language:
- en
base_model: lerobot/act
---
# ACT-PickOrange
An [ACT (Action Chunking Transformer)](https://tonyzhaozh.github.io/aloha/) policy trained from scratch on the [LeIsaac SO-101 PickOrange](https://github.com/LightwheelAI/leisaac) task.

**🔗 Project repos**:
- [vitorcen/isaaclab-experience](https://github.com/vitorcen/isaaclab-experience): Isaac Lab + LeIsaac multi-policy comparison (parent project)
- [vitorcen/LeIsaac-Training](https://github.com/vitorcen/LeIsaac-Training): LeIsaac fork (training scripts + design docs)
## TL;DR
- **Task**: `Pick up the orange and place it on the plate`. A single-arm SO-101 picks up 3 oranges one after another and places each on a plate.
- **Dataset**: [`LightwheelAI/leisaac-pick-orange`](https://huggingface.co/datasets/LightwheelAI/leisaac-pick-orange), 60 teleoperated demonstration episodes.
- **Architecture**: ACT with chunk_size=100, ~80M parameters; pure vision + joint state → action-chunk regression (no LLM, no diffusion).
- **Training**: batch=8, lr=1e-5, 10k steps, **image augmentation disabled**, ~5 h on an RTX 4090.
- **Eval**: Isaac Sim 5.1 + LeIsaac, **1/1 success @ 120 s sim time** (all 3 oranges placed on the plate).
- **⚠️ Critical inference setting**: `policy_action_horizon=32`. With the default value of 16 the policy gets stuck on the second orange (gripper jitter); with 8 it gets stuck on the first. See the critical inference caveat section below.
## Highlights
- **Reproduces and validates the [shadowHokage/act_policy](https://huggingface.co/shadowHokage/act_policy) recipe**, with a comparable or better success rate.
- **Exposes a hidden trap in LeIsaac's default `policy_action_horizon=16`**: ACT models with chunk_size=100 need horizon ≥ 32 so that the macro-motion segment of each chunk can execute. See the root-cause analysis below.
- No image augmentation, no weight-decay tuning, no special tricks: a clean ACT baseline.
## Training recipe
| Item | Value |
|---|---|
| Dataset | `LightwheelAI/leisaac-pick-orange` (60 ep, dual-cam 480×640 RGB + 6 DOF state, 30 Hz) |
| Policy | `act` (LeRobot implementation) |
| Backbone | ResNet18 vision encoder + Transformer encoder/decoder |
| `chunk_size` | 100 |
| `n_action_steps` | 100 |
| Batch size | 8 |
| Optimizer | AdamW |
| Learning rate | 1e-5 (constant) |
| Steps | 10,000 |
| Image augmentation | **disabled** |
| Hardware | RTX 4090 (24 GB) |
| Wall-clock | ~5 hours |
| Recipe credit | [shadowHokage/act_policy](https://huggingface.co/shadowHokage/act_policy) |
The training entrypoint script lives in our LeIsaac fork: [`scripts/training/act/train.sh`](https://github.com/vitorcen/LeIsaac-Training/blob/main/scripts/training/act/train.sh).
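As a quick sanity check on the recipe table, the sketch below derives a few quantities it implies. The dictionary keys are illustrative names chosen for this example, not LeRobot config fields, and the wall-clock figure is the approximate ~5 hours quoted in the table.

```python
# Values copied from the recipe table; the key names are illustrative only.
recipe = {
    "batch_size": 8,
    "steps": 10_000,
    "chunk_size": 100,
    "control_hz": 30,          # dataset and eval rate
    "wall_clock_hours": 5.0,   # approximate figure from the table
}


def derived(r: dict) -> dict:
    """Derive simple quantities implied by the recipe."""
    return {
        # Training samples drawn over the whole run (with repetition).
        "samples_seen": r["batch_size"] * r["steps"],
        # Wall-clock duration covered by one predicted action chunk.
        "chunk_duration_s": r["chunk_size"] / r["control_hz"],
        # Average training time per optimizer step.
        "seconds_per_step": r["wall_clock_hours"] * 3600 / r["steps"],
    }


summary = derived(recipe)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Each predicted chunk therefore spans a little over 3 seconds of motion at 30 Hz, which is the time scale the `policy_action_horizon` setting cuts into.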
## Eval results
| Config | Orange 1 | Orange 2 | Orange 3 | Episode success rate |
|---|---|---|---|---|
| horizon=8 | 🔴 stuck (gripper clamped, no motion) | not reached | not reached | 0/1 |
| horizon=16 | ✅ success | 🟡 gripper jitter, stuck | not reached | 0/1 |
| **horizon=32** | ✅ success | ✅ success after a struggle | ✅ success after a struggle | **1/1** ✅ |
Test setup: Isaac Sim 5.1, task `LeIsaac-SO101-PickOrange-v0`, `episode_length_s=120`, `step_hz=30`, dual-cam observations.
**Single-sample caveat**: the 1/1 success rate comes from a single episode and is not a statistical average over multiple runs. However, we take the monotonic pattern of outcomes across horizon=8 / 16 / 32 (stuck → jitter → success) as sufficient to falsify the model-capability explanation: this is a configuration issue, not a model issue.
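To make the single-sample caveat concrete, here is a minimal sketch of a Wilson score interval for a binomial success rate. The function name `wilson_interval` is introduced for this example only; it is not part of LeIsaac or LeRobot, and the Wilson interval is just one common choice of interval.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """Wilson score interval for a success rate (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)
    ) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


# 1/1 and 0/1 are the outcomes reported in the table; 3/3 is a
# hypothetical result of a 3-round evaluation, shown for comparison.
for k, n in [(1, 1), (0, 1), (3, 3)]:
    lo, hi = wilson_interval(k, n)
    print(f"{k}/{n}: [{lo:.3f}, {hi:.3f}]")
```

For a single successful episode the interval runs from roughly 0.21 to 1.0, so the per-configuration success rates in the table should be read as qualitative outcomes rather than estimates.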
## ⚠️ Critical inference caveat
**ACT chunk_size=100 + the default horizon=16 never gets past the 2nd orange.** This is not an ACT weakness; it is a hidden trap in LeIsaac's default config.
### Root cause
Each ACT chunk is a complete 100-step plan: the first ~10 steps are startup / acceleration, and only the middle segment (steps 20-80) carries the macro-motion (approach → grasp → lift → transport → release). The LeRobot async client executes chunks in a receding-horizon (sliding-window) fashion, re-querying the policy every `policy_action_horizon` steps.
- horizon=8 → only the first 8 steps of each chunk run before it is discarded and the policy is re-queried → the policy stays in the startup segment and never reaches the macro-motion → stuck.
- horizon=16 → enough for the simple "approach → grasp" of orange #1, but the more complex "place → retreat → approach orange #2" transition needs a longer execution window → OOD state and short horizon compound → jitter.
- horizon=32 → gives the macro-motion room to execute; the episode passes 1/1.
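The index arithmetic behind these three cases can be sketched as follows. This is only a counting exercise on the approximate segment boundaries quoted above (startup ≈ first 10 steps, macro-motion ≈ steps 20-80, treated here as a half-open range). It does not model the policy itself, which re-plans from a fresh observation at every query, and all names are hypothetical.

```python
import math

CHUNK_SIZE = 100
STARTUP_END = 10                 # "~10 startup steps" (approximate)
MACRO_START, MACRO_END = 20, 80  # macro-motion segment, end exclusive (assumption)
EPISODE_STEPS = 120 * 30         # episode_length_s * step_hz


def executed_breakdown(horizon: int) -> dict:
    """Count which parts of one chunk run before the next re-query."""
    executed = range(min(horizon, CHUNK_SIZE))
    startup = sum(1 for i in executed if i < STARTUP_END)
    macro = sum(1 for i in executed if MACRO_START <= i < MACRO_END)
    return {
        "startup": startup,
        "macro": macro,
        "other": len(executed) - startup - macro,
        "queries_per_episode": math.ceil(EPISODE_STEPS / horizon),
    }


for h in (8, 16, 32):
    print(h, executed_breakdown(h))
```

Under these assumptions, horizon=8 never leaves the startup indices, horizon=16 gets past them without reaching the macro-motion range, and horizon=32 is the first of the three to execute any macro-motion steps, while also roughly halving the number of policy queries per episode compared with horizon=16.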
### Recommended settings
```bash
--policy_type=lerobot-act
--policy_action_horizon=32
--policy_checkpoint_path=<path-to-this-model>
--step_hz=30 # matches the dataset's 30 Hz
--episode_length_s=120
```
## Usage
### 1. Start the LeRobot async policy_server
```bash
pip install lerobot
python -m lerobot.async_inference.policy_server --host 0.0.0.0 --port 8080
```
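Before launching the eval client, it can help to confirm that something is listening on the server port. The helper below is a minimal sketch using only the Python standard library; `server_reachable` is a name made up for this example, and a successful TCP connection only shows that the port is open, not that the policy server is healthy.

```python
import socket


def server_reachable(host: str, port: int, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Host and port match the client flags used in the next step.
    print(server_reachable("127.0.0.1", 8080))
```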
### 2. Launch the LeIsaac eval on the client
Using our [vitorcen/LeIsaac-Training](https://github.com/vitorcen/LeIsaac-Training) fork:
```bash
cd LeIsaac
bash scripts/evaluation/run_eval.sh -- \
--task=LeIsaac-SO101-PickOrange-v0 \
--eval_rounds=3 \
--episode_length_s=120 \
--step_hz=30 \
--policy_type=lerobot-act \
--policy_host=127.0.0.1 --policy_port=8080 \
--policy_checkpoint_path=wsagi/ACT-PickOrange \
--policy_action_horizon=32 \
--policy_language_instruction="Pick up the orange and place it on the plate" \
--device=cuda --enable_cameras
```
`run_eval.sh` automatically computes a wall-clock timeout from a user-patience cap, so slow inference fails fast instead of being waited on pointlessly.
## Limitations
- **Dataset OOD on 2nd-3rd orange**: the dataset has 60 episodes with one "place the N-th orange" demonstration per episode. State coverage for oranges 2 and 3 is about an order of magnitude sparser than for orange 1, so the task gets monotonically harder there and the motion becomes visibly more erratic. Even though horizon=32 rescues the nominal success rate, **precision still degrades roughly linearly with the orange index**. This is a data issue, not a model issue.
- Three independent architectures (ACT, both ours and the public shadowHokage checkpoint; Diffusion Policy; SmolVLA) trained on the same dataset **all go OOD on the 3rd orange**, so the weakness is shared across the whole policy family.
- No image augmentation or domain randomization → real-world transfer is likely weak. This checkpoint is only validated in Isaac Sim simulation; real-robot deployment is not guaranteed.
## Related
- Same-task comparisons:
  - [`wsagi/DiffusionPolicy-PickOrange`](https://huggingface.co/wsagi/DiffusionPolicy-PickOrange): self-trained Diffusion Policy (267M, DDIM 32-step swap)
  - [`shadowHokage/act_policy`](https://huggingface.co/shadowHokage/act_policy): public checkpoint with the same recipe (our reproduction reference)
  - [`LightwheelAI/leisaac-pick-orange-v0`](https://huggingface.co/LightwheelAI/leisaac-pick-orange-v0): GR00T N1.5 SOTA (completes all 3 oranges in 30 s)
- Full training + eval recipe: [vitorcen/LeIsaac-Training](https://github.com/vitorcen/LeIsaac-Training) fork
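## Acknowledgments
- The LeIsaac team and LightwheelAI for the task environment and dataset
- The LeRobot team for the ACT implementation and the async inference framework
- shadowHokage for publishing the training recipe used as our reproduction baseline
## Citation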
```bibtex
@inproceedings{zhao2023learning,
title={Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware},
author={Zhao, Tony Z. and Kumar, Vikash and Levine, Sergey and Finn, Chelsea},
booktitle={Robotics: Science and Systems},
year={2023}
}
```
## License
Apache-2.0