# Vision: SO-ARM 101 Toy-Sorting Pipeline ## End Goal Train a manipulation policy that picks up colored toy objects and drops them into matching colored trays, using imitation learning from teleoperated demos recorded in Isaac Sim. ## Pipeline ``` ┌─────────────────────────────────────────────────────────────────────┐ │ 1. Simulate │ │ Isaac Lab ToySortingEnv (Python 3.11 / Isaac Lab 2.3.2) │ │ • Wooden table + SO-ARM 101 │ │ • 3 colored trays (red | green | blue) │ │ • 9 colored toys (3 per color, randomized each episode) │ └─────────────────┬───────────────────────────────────────────────────┘ │ Phase 2: ZMQ REP :5555 ▼ ┌─────────────────────────────────────────────────────────────────────┐ │ 2. Collect Demos │ │ LeRobot / SpaceMouse teleop client (Python 3.12) │ │ • Sends joint targets to sim via ZMQ │ │ • Streams obs/actions into LeRobot Dataset v3 format │ │ • Pushes dataset to HuggingFace Hub │ └─────────────────┬───────────────────────────────────────────────────┘ │ ▼ ┌─────────────────────────────────────────────────────────────────────┐ │ 3. Augment Dataset (optional) │ │ • Background swap, color jitter, domain randomization │ │ • Re-label with reward signal for RL fine-tuning │ └─────────────────┬───────────────────────────────────────────────────┘ │ ▼ ┌─────────────────────────────────────────────────────────────────────┐ │ 4. Train Policy │ │ lerobot train policy=act (or diffusion_policy) │ │ • Loads dataset from HuggingFace Hub │ │ • Saves checkpoint to Hub │ └─────────────────┬───────────────────────────────────────────────────┘ │ ▼ ┌─────────────────────────────────────────────────────────────────────┐ │ 5. Evaluate in Sim │ │ • Roll out policy in ToySortingEnv │ │ • Log success rate (sort 3 toys correctly in <60 s) │ │ • Push eval metrics to HuggingFace Hub │ └─────────────────────────────────────────────────────────────────────┘ ``` ## Container Architecture ``` ┌────────────────────────────────────────┐ ┌──────────────────────────────────────┐ │ sim (isaac-lab:2.3.2, Python 3.11) │ │ train (python:3.12-slim + lerobot) │ │ Isaac Lab ToySortingEnv │◄──►│ LeRobot training / data collection │ │ ZMQ REP server :5555 (Phase 2) │ZMQ │ IsaacGymClient gymnasium wrapper │ │ X11 GUI for visualization │ │ HuggingFace dataset push │ └────────────────────────────────────────┘ └──────────────────────────────────────┘ ``` ## Phases | Phase | Status | Description | |-------|--------|-------------| | 1 | **Done** | Isaac Lab env with real USD assets; X11 visualization | | 2 | Scaffolded | ZMQ bridge + LeRobot demo collection client | | 3 | Planned | Dataset augmentation pipeline | | 4 | Planned | ACT / Diffusion Policy training with LeRobot | | 5 | Planned | Closed-loop evaluation + HF metrics push | ## Asset Strategy Assets live outside git (large binary files). Two distribution paths: 1. **Developer machine**: `python assets/download.py --extract` copies the needed USD files from the local `Lightwheel_Xx8T7EPOMd_KitchenRoom/` pack. 2. **Docker / CI**: `python assets/download.py --download` fetches the pre-extracted subset from HuggingFace Hub (`HF_ASSET_REPO` env var). Neither the git repo nor the Docker image contains asset files directly. ## Success Metric (Phase 5 target) > Place all 9 toys into their correct color-matched tray within 60 seconds, > measured over 50 random seeds. Target success rate ≥ 80 %.