ACT — ALOHA Single-Arm (Left) — 40k steps (SHIPPED)

Action Chunking Transformer (ACT) policy for a single-arm (LEFT) Trossen ALOHA manipulation task — autonomous O2 mask placement on a human surrogate (MEDEVAC-inspired).

This is the production-shipped retrain at 40,000 steps (workstream S003). It supersedes the initial 13.4k-step baseline at JHeisler/aloha_solo_left_4_6_26_act_left. The architecture and dataset are unchanged; this run applies ~3× the gradient updates and ~3× the data exposure.

Training Config

Field	Value
Architecture	ACT (ResNet18 backbone + 4-layer Transformer encoder + VAE chunking head)
Dataset	JHeisler/aloha_solo_left_4_6_26 — 50 episodes, 29,785 samples, 30 fps
State / action dim	9 / 9
Cameras	`cam_high`, `cam_left_wrist` (3×480×640 each)
Steps	40,000
Batch size	48
Learning rate	6e-5 peak (500-step linear warmup, then cosine decay)
Total samples seen	~1.92M (~64 epochs over the dataset)
AMP	enabled
torch.compile	enabled
Save freq	every 10,000 steps (10k / 20k / 30k / 40k checkpoints)
Final loss	~0.015
Final grad norm	~0.19
Wall clock	~6h 7min on RTX A4500
LeRobot pin	`96c7052777aca85d4e55dfba8f81586103ba8f61`
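
The learning-rate row above can be sketched as a plain function of the step index. This is a minimal illustration, not the scheduler used in the training run: it assumes the warmup ramps linearly up to the 6e-5 peak over the first 500 steps, that the cosine phase spans the remaining steps, and that it decays to zero (`min_lr` is a hypothetical parameter, since the card does not state a floor).

```python
import math

PEAK_LR = 6e-5        # from the config table
WARMUP_STEPS = 500    # from the config table
TOTAL_STEPS = 40_000  # from the config table


def lr_at(step, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS,
          total_steps=TOTAL_STEPS, min_lr=0.0):
    """Learning rate at a 0-indexed step (min_lr is an assumed floor)."""
    if step < warmup_steps:
        # Linear ramp that reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the cosine phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for step in (0, 499, 500, 20_250, 40_000):
    print(f"step {step:>6}: lr = {lr_at(step):.3e}")
```

Under these assumptions the rate sits at the peak around step 500, passes through half the peak midway through the cosine phase, and reaches the floor at step 40,000.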

Why retrain at 40k?

The initial 13.4k run (S001) trained for ~21 epochs and showed signs of underfitting in real-robot evaluation (gripper timing and distance judgement failures). 40k steps is a pragmatic step-up (~64 epochs) that stops short of the full original 80k Colab budget; the final loss roughly halved (0.029 → 0.015) and the grad norm fell to about ¼ of its earlier value.
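
The epoch counts quoted above follow from steps × batch size ÷ dataset size. The sketch below reproduces that arithmetic; it assumes the S001 run also used a batch size of 48, which is consistent with the ~640K samples listed for it but is not stated explicitly in this card.

```python
DATASET_SAMPLES = 29_785  # from the config table
BATCH_SIZE = 48           # from the config table (assumed for S001 as well)


def exposure(steps, batch_size=BATCH_SIZE, dataset_samples=DATASET_SAMPLES):
    """Return (samples seen, passes over the dataset) for a run."""
    samples_seen = steps * batch_size
    return samples_seen, samples_seen / dataset_samples


for label, steps in (("S001", 13_400), ("S003", 40_000)):
    seen, epochs = exposure(steps)
    print(f"{label}: {seen:,} samples seen, {epochs:.1f} epochs")
# S001: 643,200 samples seen, 21.6 epochs
# S003: 1,920,000 samples seen, 64.5 epochs
```

The card's "~21" and "~64" are these values with the fractional part dropped.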

Project Lineage

Workstream	Model	Steps	Samples seen	HF repo
S001	ACT	13,400	640K	act_left
S002	Hybrid ACT+Diffusion	13,400	321K	act_diffusion
S003	ACT (shipped)	40,000	1.92M	this repo
S004	Hybrid ACT+Diffusion	40,000	1.12M	act_diffusion_40k
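
As a sanity check on the table, dividing samples seen by steps gives the number of samples consumed per gradient step in each run. This is only an inference from the rounded figures above; the hybrid runs' actual batch sizes are not stated in this card.

```python
# (steps, approximate samples seen) copied from the lineage table.
RUNS = {
    "S001": (13_400, 640_000),
    "S002": (13_400, 321_000),
    "S003": (40_000, 1_920_000),
    "S004": (40_000, 1_120_000),
}


def implied_batch_size(steps, samples_seen):
    """Samples consumed per gradient step, inferred from the table."""
    return samples_seen / steps


for name, (steps, samples) in RUNS.items():
    print(f"{name}: ~{implied_batch_size(steps, samples):.1f} samples per step")
```

For S003 this works out to exactly 48, matching the batch size in the training config.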

Usage

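from lerobot.common.policies.act.modeling_act import ACTPolicy  # module path may differ in other LeRobot versions
# Download the 40k-step checkpoint from the Hugging Face Hub.
policy = ACTPolicy.from_pretrained("JHeisler/aloha_solo_left_4_6_26_act_left_40k")
policy.eval()  # inference mode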
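
A rough sketch of querying the loaded policy for one action is shown below. The shapes come from the training config table (9-dim state, two 3×480×640 cameras). The observation key names are an assumption based on LeRobot's usual `observation.*` naming and should be checked against the checkpoint's saved config. `expected_observation_shapes` and `run_dummy_inference` are hypothetical helper names, and the all-zero inputs only exercise the call path, not a meaningful rollout.

```python
# Shapes come from the training config table; the key names are an assumption
# and should be verified against the checkpoint's saved config.
STATE_DIM = 9
CAMERAS = ("cam_high", "cam_left_wrist")
IMAGE_SHAPE = (3, 480, 640)  # channels, height, width


def expected_observation_shapes(batch_size=1):
    """Return {key: shape} for one batch of observations."""
    shapes = {"observation.state": (batch_size, STATE_DIM)}
    for cam in CAMERAS:
        shapes[f"observation.images.{cam}"] = (batch_size, *IMAGE_SHAPE)
    return shapes


def run_dummy_inference(repo_id="JHeisler/aloha_solo_left_4_6_26_act_left_40k"):
    """Load the policy and query one action from an all-zero observation."""
    # Imported lazily so the shape helper above works without torch/lerobot.
    import torch
    from lerobot.common.policies.act.modeling_act import ACTPolicy

    policy = ACTPolicy.from_pretrained(repo_id)
    policy.eval()
    policy.reset()  # clear any queued action chunk before a new episode
    device = next(policy.parameters()).device
    batch = {
        key: torch.zeros(shape, dtype=torch.float32, device=device)
        for key, shape in expected_observation_shapes().items()
    }
    with torch.no_grad():
        action = policy.select_action(batch)
    return tuple(action.shape)
```

If the key names match the checkpoint, the returned action should have a trailing dimension of 9, in line with the action dim listed in the config table.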

Citation / Course

EN.525.681 school project — JHU Whiting School of Engineering. Team: Jake Heisler, Laura Kroening, Purushottam Shukla.

Code reference: HuggingFace LeRobot at commit 96c7052.

Model size: 51.7M params (Safetensors, tensor type F32)
