# Vision: SO-ARM 101 Toy-Sorting Pipeline

## End Goal

Train a manipulation policy that picks up colored toy objects and drops them
into matching colored trays, using imitation learning from teleoperated demos
recorded in Isaac Sim.

## Pipeline

```
┌─────────────────────────────────────────────────────────────────────┐
│  1. Simulate                                                        │
│     Isaac Lab ToySortingEnv (Python 3.11 / Isaac Lab 2.3.2)         │
│     • Wooden table + SO-ARM 101                                     │
│     • 3 colored trays (red | green | blue)                          │
│     • 9 colored toys (3 per color, randomized each episode)         │
└─────────────────┬───────────────────────────────────────────────────┘
                  │ Phase 2: ZMQ REP :5555
                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│  2. Collect Demos                                                   │
│     LeRobot / SpaceMouse teleop client (Python 3.12)                │
│     • Sends joint targets to sim via ZMQ                            │
│     • Streams obs/actions into LeRobot Dataset v3 format            │
│     • Pushes dataset to HuggingFace Hub                             │
└─────────────────┬───────────────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│  3. Augment Dataset (optional)                                      │
│     • Background swap, color jitter, domain randomization           │
│     • Re-label with reward signal for RL fine-tuning                │
└─────────────────┬───────────────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│  4. Train Policy                                                    │
│     lerobot train policy=act (or diffusion_policy)                  │
│     • Loads dataset from HuggingFace Hub                            │
│     • Saves checkpoint to Hub                                       │
└─────────────────┬───────────────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│  5. Evaluate in Sim                                                 │
│     • Roll out policy in ToySortingEnv                              │
│     • Log success rate (sort 3 toys correctly in <60 s)             │
│     • Push eval metrics to HuggingFace Hub                          │
└─────────────────────────────────────────────────────────────────────┘
```

## Container Architecture

```
┌──────────────────────────────────────┐     ┌──────────────────────────────────────┐
│  sim (isaac-lab:2.3.2, Python 3.11)  │     │  train (python:3.12-slim + lerobot)  │
│  Isaac Lab ToySortingEnv             │◄───►│  LeRobot training / data collection  │
│  ZMQ REP server :5555 (Phase 2)      │ ZMQ │  IsaacGymClient gymnasium wrapper    │
│  X11 GUI for visualization           │     │  HuggingFace dataset push            │
└──────────────────────────────────────┘     └──────────────────────────────────────┘
```

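The REP server in the `sim` container could be sketched roughly as follows. This is a minimal illustration, not the project's actual bridge: the JSON message schema (`cmd`, `joint_targets`, `obs`), the `handle_request` and `serve` helpers, and the default of 6 joints are all assumptions made for the example. Only the REP socket type and port 5555 come from the diagram above. The real server would call into `ToySortingEnv` where the placeholder comment sits.

```python
import json


def handle_request(message: dict, state: dict) -> dict:
    """Apply one client request to a stand-in sim state and build the reply.

    The message schema here is hypothetical, chosen only for illustration.
    """
    cmd = message.get("cmd")
    if cmd == "reset":
        state["joint_pos"] = [0.0] * state["num_joints"]
        return {"ok": True, "obs": {"joint_pos": state["joint_pos"]}}
    if cmd == "step":
        targets = message.get("joint_targets", [])
        if len(targets) != state["num_joints"]:
            return {"ok": False, "error": "wrong number of joint targets"}
        # Placeholder for the real env step: the stand-in jumps to the targets.
        state["joint_pos"] = [float(t) for t in targets]
        return {"ok": True, "obs": {"joint_pos": state["joint_pos"]}}
    return {"ok": False, "error": f"unknown cmd: {cmd!r}"}


def serve(endpoint: str = "tcp://*:5555", num_joints: int = 6) -> None:
    """Run a blocking REP loop. Requires pyzmq, imported lazily on purpose."""
    import zmq  # pyzmq; kept local so handle_request can be tested without it

    state = {"num_joints": num_joints, "joint_pos": [0.0] * num_joints}
    socket = zmq.Context.instance().socket(zmq.REP)
    socket.bind(endpoint)
    while True:
        # A REP socket must alternate recv and send: one reply per request.
        request = json.loads(socket.recv())
        reply = handle_request(request, state)
        socket.send(json.dumps(reply).encode("utf-8"))
```

Keeping the request handling in a pure function separate from the socket loop means the protocol can be unit-tested in the `train` container without Isaac Lab or a running server. A matching client would open a `zmq.REQ` socket, connect to the sim container on port 5555, and send one JSON request per control step.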
## Phases

| Phase | Status | Description |
|-------|--------|-------------|
| 1 | **Done** | Isaac Lab env with real USD assets; X11 visualization |
| 2 | Scaffolded | ZMQ bridge + LeRobot demo collection client |
| 3 | Planned | Dataset augmentation pipeline |
| 4 | Planned | ACT / Diffusion Policy training with LeRobot |
| 5 | Planned | Closed-loop evaluation + HF metrics push |

## Asset Strategy

Assets live outside git (large binary files). Two distribution paths:

1. **Developer machine**: `python assets/download.py --extract` copies the
   needed USD files from the local `Lightwheel_Xx8T7EPOMd_KitchenRoom/` pack.
2. **Docker / CI**: `python assets/download.py --download` fetches the
   pre-extracted subset from HuggingFace Hub (`HF_ASSET_REPO` env var).

Neither the git repo nor the Docker image contains asset files directly.

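The two paths above suggest a small command-line front end. The sketch below shows one way the flag handling might look. It is not the contents of `assets/download.py`: the `parse_args` and `plan` helpers are hypothetical, treating the two flags as mutually exclusive is an assumption, and the actual copy or Hub download is deliberately left out.

```python
import argparse
import os


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the two distribution modes; exactly one must be given (assumed)."""
    parser = argparse.ArgumentParser(description="Fetch USD assets kept outside git.")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--extract", action="store_true",
                      help="copy the needed USD files from the local asset pack")
    mode.add_argument("--download", action="store_true",
                      help="fetch the pre-extracted subset from the Hub")
    return parser.parse_args(argv)


def plan(args: argparse.Namespace, env=None) -> dict:
    """Describe what would be done, without touching disk or network."""
    env = os.environ if env is None else env
    if args.extract:
        # Local pack directory name taken from the text above.
        return {"mode": "extract", "source": "Lightwheel_Xx8T7EPOMd_KitchenRoom/"}
    repo = env.get("HF_ASSET_REPO")
    if not repo:
        # Fail early rather than attempting a download with no repo configured.
        raise SystemExit("HF_ASSET_REPO must be set for --download")
    return {"mode": "download", "source": repo}
```

For example, `plan(parse_args(["--download"]), env={"HF_ASSET_REPO": "org/assets"})` yields a download plan for the placeholder repo `org/assets`, while omitting the variable exits with an error before any network access.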
## Success Metric (Phase 5 target)

> Place all 9 toys into their correct color-matched tray within 60 seconds,
> measured over 50 random seeds. Target success rate ≥ 80 %.

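The metric above can be written down as a short scoring routine. This is a sketch under assumptions: the `EpisodeResult` record and its field names are hypothetical, and "within 60 seconds" is read as an inclusive bound.

```python
from dataclasses import dataclass

NUM_TOYS = 9          # all toys must end up in the matching tray
TIME_LIMIT_S = 60.0   # per-episode time budget
TARGET_RATE = 0.80    # required fraction of successful seeds


@dataclass
class EpisodeResult:
    seed: int
    toys_in_correct_tray: int
    elapsed_s: float


def is_success(result: EpisodeResult) -> bool:
    """An episode counts only if every toy is sorted within the time limit."""
    return (result.toys_in_correct_tray == NUM_TOYS
            and result.elapsed_s <= TIME_LIMIT_S)


def success_rate(results: list[EpisodeResult]) -> float:
    """Fraction of episodes that succeeded."""
    if not results:
        raise ValueError("no episodes to score")
    return sum(is_success(r) for r in results) / len(results)


def meets_target(results: list[EpisodeResult]) -> bool:
    """True when the success rate reaches the Phase 5 target."""
    return success_rate(results) >= TARGET_RATE
```

With 50 seeds, the target corresponds to at least 40 fully sorted episodes. Partial sorts (for example 8 of 9 toys) count as failures under this reading.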