ACT β ALOHA Single-Arm (Left) β 13.4k steps
Action Chunking Transformer (ACT) policy for a single-arm (LEFT) Trossen ALOHA manipulation task β autonomous O2 mask placement on a human surrogate (MEDEVAC-inspired).
This is the initial 13.4k-step baseline run (S001). For the production-shipped 40k retrain, see JHeisler/aloha_solo_left_4_6_26_act_left_40k.
Training Config
| Field | Value |
|---|---|
| Architecture | ACT (ResNet18 backbone + 4-layer Transformer encoder + VAE chunking head) |
| Dataset | JHeisler/aloha_solo_left_4_6_26 β 50 episodes, 29,785 samples, 30 fps |
| State / action dim | 9 / 9 |
| Cameras | cam_high, cam_left_wrist (3Γ480Γ640 each) |
| Steps | 13,400 |
| Batch size | 48 |
| Learning rate | 6e-5 (linear warmup 500 β cosine) |
| Total samples seen | |
| AMP | enabled |
| torch.compile | enabled |
| Final loss | 0.029 |
| Final grad norm | 0.80 |
| Wall clock | ~2h 3min on RTX A4500 |
| LeRobot pin | 96c7052777aca85d4e55dfba8f81586103ba8f61 |
Hardware
Trained locally on NVIDIA RTX A4500 (20 GB VRAM). Pipeline ported from the original Colab notebook (Tesla A100, batch=8, 80,000 steps); local-A4500 config preserves total samples seen at 2.5Γ the wall-clock speedup.
Project Lineage
This is part of a 4-policy comparison study on the same dataset:
| Workstream | Model | Steps | Samples | HF |
|---|---|---|---|---|
| S001 | ACT | 13,400 | 640K | this repo |
| S002 | Hybrid ACT+Diffusion | 13,400 | 321K | act_diffusion |
| S003 | ACT (shipped) | 40,000 | 1.92M | act_left_40k |
| S004 | Hybrid ACT+Diffusion | 40,000 | 1.12M | act_diffusion_40k |
Usage
from lerobot.common.policies.act.modeling_act import ACTPolicy
policy = ACTPolicy.from_pretrained("JHeisler/aloha_solo_left_4_6_26_act_left")
Citation / Course
EN.525.681 school project β JHU Whiting School of Engineering. Team: Jake Heisler, Laura Kroening, Purushottam Shukla.
Code reference: HuggingFace LeRobot at commit 96c7052.
- Downloads last month
- 61