# RoboTwin2 Checkpoints

ACT and Pi0.5 single-task fine-tuning on the RoboTwin2.0 dataset, trained on an H200 GPU.

The policies were trained on the following tasks:
- place_phone_stand
- place_a2b_left
- move_can_pot
- handover_block
- put_bottles_dustbin
## Data

- Demonstrations: 50 demo_clean episodes per task
- Embodiment: aloha-agilex (dual-arm)
- Action dim: 14 (6 DOF × 2 arms + 2 grippers)
- Cameras: cam_high, cam_right_wrist, cam_left_wrist
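To make the embodiment concrete, here is a minimal sketch of the per-timestep data layout. The image resolution and dictionary keys are placeholders for illustration; the exact schema is defined by the RoboTwin2.0 loader:

```python
import numpy as np

# Illustrative shapes for one aloha-agilex timestep (resolution is assumed).
CAMERAS = ["cam_high", "cam_right_wrist", "cam_left_wrist"]

obs = {name: np.zeros((480, 640, 3), dtype=np.uint8) for name in CAMERAS}
qpos = np.zeros(14)    # 6 DOF x 2 arms + 2 grippers
action = np.zeros(14)  # same 14-dim layout as qpos
```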
## ACT (Action Chunking with Transformers)

### Architecture
| Param | Value |
|---|---|
| Backbone | ResNet-18 |
| Hidden dim | 512 |
| Feedforward dim | 3200 |
| Attention heads | 8 |
| Encoder layers | 4 |
| Decoder layers | 7 |
| Chunk size | 50 |
| KL weight | 10 |
| Action dim | 14 |
| Dropout | 0.1 |
| Parameters | ~83.9M |
### Training
| Param | Value |
|---|---|
| Batch size | 8 |
| Epochs | 6000 |
| Learning rate | 1e-5 |
| LR backbone | 1e-5 |
| Weight decay | 1e-4 |
| Optimizer | AdamW |
| Save freq | every 2000 epochs |
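The separate `Learning rate` and `LR backbone` rows above suggest a two-group optimizer, as in the original ACT training code. The sketch below illustrates that split in PyTorch with toy stand-in modules, not the real policy:

```python
import torch
import torch.nn as nn

# Toy stand-ins: the real model is the ACT policy (ResNet-18 backbone +
# transformer encoder/decoder). This only illustrates the optimizer split.
model = nn.ModuleDict({
    "backbone": nn.Linear(512, 512),
    "transformer": nn.Linear(512, 14),
})

param_groups = [
    {"params": model["backbone"].parameters(), "lr": 1e-5},     # LR backbone
    {"params": model["transformer"].parameters(), "lr": 1e-5},  # Learning rate
]
optimizer = torch.optim.AdamW(param_groups, weight_decay=1e-4)
```

Here both groups use the same learning rate, but the split allows the backbone to be tuned independently of the transformer head.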
### Checkpoints
| Path | Seed | Val Loss |
|---|---|---|
| ACT/act-place_phone_stand/demo_clean-50/ | 0 | — |
| ACT/act-place_phone_stand-run2/demo_clean-50/ | 1 | 0.038 |
| ACT/act-place_a2b_left/demo_clean-50/ | 0 | — |
| ACT/act-place_a2b_left-run2/demo_clean-50/ | 1 | 0.059 |
| ACT/act-move_can_pot/demo_clean-50/ | 0 | — |
| ACT/act-move_can_pot-run2/demo_clean-50/ | 1 | 0.036 |
| ACT/act-handover_block-run2/demo_clean-50/ | 1 | 0.030 |
| ACT/act-put_bottles_dustbin-run2/demo_clean-50/ | 1 | 0.032 |
Each checkpoint directory contains:

- policy_best.ckpt — best validation loss checkpoint
- policy_last.ckpt — final epoch checkpoint
- policy_epoch_{2000,4000,5000,6000}_seed_{0,1}.ckpt — intermediate checkpoints
- dataset_stats.pkl — normalization statistics
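A minimal sketch of consuming these artifacts at inference time. The key names inside `dataset_stats.pkl` are assumptions for illustration; consult the ACT codebase for the actual schema:

```python
import numpy as np

# Assumed stats schema: per-dimension mean/std for the 14-dim action space.
# In practice: stats = pickle.load(open("dataset_stats.pkl", "rb"))
stats = {"action_mean": np.zeros(14), "action_std": np.ones(14)}

def unnormalize_action(a_norm, stats):
    """Map a normalized 14-dim action chunk back to robot units."""
    return a_norm * stats["action_std"] + stats["action_mean"]

# The policy predicts a chunk of 50 normalized actions per query.
chunk = unnormalize_action(np.zeros((50, 14)), stats)
```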
## Pi0.5 LoRA (place_phone_stand only)
Fine-tuned from gs://openpi-assets/checkpoints/pi05_base/params using the openpi framework.
### Architecture
| Param | Value |
|---|---|
| Base model | Pi0.5 (3B params) |
| PaliGemma variant | gemma_2b_lora |
| Action expert variant | gemma_300m_lora |
| Fine-tuning method | LoRA |
### Training
| Param | Value |
|---|---|
| Batch size | 32 |
| Total steps | 20,000 (trained to 9,000) |
| Save interval | 200 steps |
| XLA memory fraction | 0.45 (64 GB pool on H200) |
| GPU | NVIDIA H200 (143 GB VRAM) |
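The `XLA memory fraction` row above corresponds to JAX's `XLA_PYTHON_CLIENT_MEM_FRACTION` environment variable. A hedged shell sketch follows; the surrounding training invocation is omitted, since the exact openpi command is not reproduced here:

```shell
# Cap JAX's preallocated GPU pool at 45% of VRAM
# (0.45 * 143 GB ≈ 64 GB on an H200), leaving headroom
# for the simulator and data-loading processes.
export XLA_PYTHON_CLIENT_MEM_FRACTION=0.45
```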
### Checkpoints
| Path | Step |
|---|---|
| pi05_lora/place_phone_stand/step_5000/ | 5,000 |
| pi05_lora/place_phone_stand/step_9000/ | 9,000 |
## Environment
- Framework: RoboTwin2.0
- Simulator: SAPIEN with Vulkan rendering
- GPU: NVIDIA H200 SXM (143 GB VRAM)
- CUDA: 12.8