# Vision: SO-ARM 101 Toy-Sorting Pipeline
## End Goal
Train a manipulation policy that picks up colored toy objects and drops them
into matching colored trays, using imitation learning from teleoperated demos
recorded in Isaac Sim.
## Pipeline
```
┌─────────────────────────────────────────────────────────────────────┐
│  1. Simulate                                                        │
│     Isaac Lab ToySortingEnv (Python 3.11 / Isaac Lab 2.3.2)        │
│     • Wooden table + SO-ARM 101                                     │
│     • 3 colored trays (red | green | blue)                          │
│     • 9 colored toys (3 per color, randomized each episode)         │
└─────────────────┬───────────────────────────────────────────────────┘
                  │ Phase 2: ZMQ REP :5555
                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│  2. Collect Demos                                                   │
│     LeRobot / SpaceMouse teleop client (Python 3.12)                │
│     • Sends joint targets to sim via ZMQ                            │
│     • Streams obs/actions into LeRobot Dataset v3 format            │
│     • Pushes dataset to HuggingFace Hub                             │
└─────────────────┬───────────────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│  3. Augment Dataset (optional)                                      │
│     • Background swap, color jitter, domain randomization           │
│     • Re-label with reward signal for RL fine-tuning                │
└─────────────────┬───────────────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│  4. Train Policy                                                    │
│     lerobot train policy=act (or diffusion_policy)                  │
│     • Loads dataset from HuggingFace Hub                            │
│     • Saves checkpoint to Hub                                       │
└─────────────────┬───────────────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│  5. Evaluate in Sim                                                 │
│     • Roll out policy in ToySortingEnv                              │
│     • Log success rate (sort 3 toys correctly in <60 s)             │
│     • Push eval metrics to HuggingFace Hub                          │
└─────────────────────────────────────────────────────────────────────┘
```
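
As a rough illustration of the "3 per color, randomized each episode" setup in step 1, the following sketch samples one episode layout from a seed. It assumes that "randomized" refers to spawn positions on the table; the `ToySpawn` type, the `sample_episode` function, and the table bounds are hypothetical and are not taken from `ToySortingEnv`.

```python
import random
from dataclasses import dataclass

COLORS = ("red", "green", "blue")
TOYS_PER_COLOR = 3

# Hypothetical spawn region in metres; the real env's bounds are not documented here.
X_RANGE = (0.10, 0.40)
Y_RANGE = (-0.25, 0.25)


@dataclass(frozen=True)
class ToySpawn:
    color: str
    x: float
    y: float


def sample_episode(seed: int) -> list[ToySpawn]:
    """Return 9 toy spawns (3 per color) at seeded random table positions."""
    rng = random.Random(seed)  # a private RNG keeps episodes reproducible per seed
    spawns = [
        ToySpawn(color, rng.uniform(*X_RANGE), rng.uniform(*Y_RANGE))
        for color in COLORS
        for _ in range(TOYS_PER_COLOR)
    ]
    rng.shuffle(spawns)  # avoid a fixed color order in the spawn list
    return spawns
```

Seeding per episode means the same seed can be replayed during demo collection and again during evaluation.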
## Container Architecture
```
┌──────────────────────────────────────┐     ┌──────────────────────────────────────┐
│ sim (isaac-lab:2.3.2, Python 3.11)   │     │ train (python:3.12-slim + lerobot)   │
│   Isaac Lab ToySortingEnv            │ ◄─► │   LeRobot training / data collection │
│   ZMQ REP server :5555 (Phase 2)     │ ZMQ │   IsaacGymClient gymnasium wrapper   │
│   X11 GUI for visualization          │     │   HuggingFace dataset push           │
└──────────────────────────────────────┘     └──────────────────────────────────────┘
```
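
The request/reply link between the two containers could look roughly like the sketch below. The JSON message schema (`cmd`, `action`, `obs`, `done`), the function names, and the `env` interface are assumptions made for illustration; the actual bridge and `IsaacGymClient` may differ. Only the REP pattern and port 5555 come from the diagram above.

```python
import json


def handle_request(raw: bytes, env) -> bytes:
    """Decode one request, apply it to `env`, and encode the reply.

    `env` is any object with reset() -> obs and step(action) -> (obs, done);
    this interface is hypothetical.
    """
    msg = json.loads(raw.decode("utf-8"))
    cmd = msg.get("cmd")
    if cmd == "reset":
        reply = {"obs": env.reset()}
    elif cmd == "step":
        obs, done = env.step(msg["action"])
        reply = {"obs": obs, "done": done}
    else:
        reply = {"error": f"unknown cmd: {cmd}"}
    return json.dumps(reply).encode("utf-8")


def serve(env, endpoint: str = "tcp://*:5555") -> None:
    """Sim side: answer requests forever on a ZMQ REP socket."""
    import zmq  # imported lazily so handle_request is usable without pyzmq

    sock = zmq.Context.instance().socket(zmq.REP)
    sock.bind(endpoint)
    while True:
        # A REP socket must send exactly one reply per received request.
        sock.send(handle_request(sock.recv(), env))


def request(sock, cmd: str, **fields) -> dict:
    """Train side: send one command on a REQ socket and wait for the reply."""
    sock.send(json.dumps({"cmd": cmd, **fields}).encode("utf-8"))
    return json.loads(sock.recv().decode("utf-8"))
```

Keeping the message handling in a pure function means the protocol can be unit-tested in the `train` container without starting Isaac Sim.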
## Phases
| Phase | Status | Description |
|-------|--------|-------------|
| 1 | **Done** | Isaac Lab env with real USD assets; X11 visualization |
| 2 | Scaffolded | ZMQ bridge + LeRobot demo collection client |
| 3 | Planned | Dataset augmentation pipeline |
| 4 | Planned | ACT / Diffusion Policy training with LeRobot |
| 5 | Planned | Closed-loop evaluation + HF metrics push |
## Asset Strategy
Assets live outside git (large binary files). Two distribution paths:
1. **Developer machine**: `python assets/download.py --extract` copies the
needed USD files from the local `Lightwheel_Xx8T7EPOMd_KitchenRoom/` pack.
2. **Docker / CI**: `python assets/download.py --download` fetches the
pre-extracted subset from HuggingFace Hub (`HF_ASSET_REPO` env var).
Neither the git repo nor the Docker image contains asset files directly.
## Success Metric (Phase 5 target)
> Place all 9 toys into their correct color-matched tray within 60 seconds,
> measured over 50 random seeds. Target success rate ≥ 80 %.