pravsels's picture
Upload TRAINING_LOG.md with huggingface_hub
54665c3 verified

Training Log — ARX5 Multitask Micro Baseline

Mode

  • run_type: experiment
  • objective: Fine-tune PI0.5 on the micro training mix (14 datasets) with valid-index filtering (human-controlled + successful episodes).

Config

  • config: pi05_arx5_multitask_micro_baseline
  • exp_name: micro_baseline_v1
  • dataset: training_mix_micro.json — 14 villekuosmanen/* LeRobot repos
  • key settings: 14D bimanual action space (7D padded), delta actions (delta joints, absolute grippers), per-timestep action normalization, 30k steps, batch_size=36, lr=5e-5 cosine (1k warmup), from pi0.5 base weights

Training Dynamics

Step Loss Grad Norm
0 0.1664 2.3741
5,000 0.0274 0.1175
10,000 0.0197 0.0886
15,000 0.0159 0.0789
20,000 0.0133 0.0677
25,000 0.0119 0.0655
29,900 0.0107 0.0592
  • loss_one_liner: Steep drop from 0.17 to ~0.03 in the first 5k, then steady decline to 0.011 by 30k; no plateau or overfitting.

W&B

Checkpoint Hashes

Verify with:

cd checkpoints/<step> && find params -type f | sort | xargs sha256sum | sha256sum
Step SHA-256
25,000 69ee51b80032d3a4424bd3834167fdd4d839701ab3b267c73ae6b7386922f1f8
29,999 450e1c86c1d95ccb7215cc3662b90c6b56fb483006b640dfa2bc70bfa2593c01

Status

  • Started: Friday, Mar 27th, 2026
  • Completed: Friday, Mar 27th, 2026
  • Runtime: 04:54:46
  • Published checkpoints: 25k, 29999 (params-only)