Vision: SO-ARM 101 Toy-Sorting Pipeline

End Goal

Train a manipulation policy that picks up colored toy objects and drops them into matching colored trays, using imitation learning from teleoperated demos recorded in Isaac Sim.

Pipeline

┌─────────────────────────────────────────────────────────────────────┐
│  1. Simulate                                                         │
│     Isaac Lab ToySortingEnv (Python 3.11 / Isaac Lab 2.3.2)         │
│     • Wooden table + SO-ARM 101                                      │
│     • 3 colored trays  (red | green | blue)                          │
│     • 9 colored toys   (3 per color, randomized each episode)        │
└─────────────────┬───────────────────────────────────────────────────┘
                  │  Phase 2: ZMQ REP :5555
                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│  2. Collect Demos                                                    │
│     LeRobot / SpaceMouse teleop client (Python 3.12)                 │
│     • Sends joint targets to sim via ZMQ                             │
│     • Streams obs/actions into LeRobot Dataset v3 format             │
│     • Pushes dataset to HuggingFace Hub                              │
└─────────────────┬───────────────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│  3. Augment Dataset (optional)                                       │
│     • Background swap, color jitter, domain randomization           │
│     • Re-label with reward signal for RL fine-tuning                 │
└─────────────────┬───────────────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│  4. Train Policy                                                     │
│     lerobot train policy=act  (or diffusion_policy)                  │
│     • Loads dataset from HuggingFace Hub                             │
│     • Saves checkpoint to Hub                                        │
└─────────────────┬───────────────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│  5. Evaluate in Sim                                                  │
│     • Roll out policy in ToySortingEnv                               │
│     • Log success rate (sort 3 toys correctly in <60 s)             │
│     • Push eval metrics to HuggingFace Hub                           │
└─────────────────────────────────────────────────────────────────────┘

Container Architecture

┌────────────────────────────────────────┐    ┌──────────────────────────────────────┐
│  sim  (isaac-lab:2.3.2, Python 3.11)   │    │  train  (python:3.12-slim + lerobot) │
│  Isaac Lab ToySortingEnv               │◄──►│  LeRobot training / data collection  │
│  ZMQ REP server :5555  (Phase 2)       │ZMQ │  IsaacGymClient gymnasium wrapper    │
│  X11 GUI for visualization             │    │  HuggingFace dataset push            │
└────────────────────────────────────────┘    └──────────────────────────────────────┘

Phases

Phase	Status	Description
1	Done	Isaac Lab env with real USD assets; X11 visualization
2	Scaffolded	ZMQ bridge + LeRobot demo collection client
3	Planned	Dataset augmentation pipeline
4	Planned	ACT / Diffusion Policy training with LeRobot
5	Planned	Closed-loop evaluation + HF metrics push

Asset Strategy

Assets live outside git (large binary files). Two distribution paths:

Developer machine: python assets/download.py --extract copies the needed USD files from the local Lightwheel_Xx8T7EPOMd_KitchenRoom/ pack.
Docker / CI: python assets/download.py --download fetches the pre-extracted subset from HuggingFace Hub (HF_ASSET_REPO env var).

Neither the git repo nor the Docker image contains asset files directly.

Success Metric (Phase 5 target)

Place all 9 toys into their correct color-matched tray within 60 seconds, measured over 50 random seeds. Target success rate ≥ 80 %.