Split-Expert Bring-Up (2026-03-10)
This bundle captures the initial PyTorch bring-up for the new packed TWIN split-action-expert path on pi0.5.
Included here:
- exact split warm-start checkpoints created from the original single-head PyTorch base checkpoint
- invariant-check outputs for split_independent and split_communicating
- detached real-data smoke and 20-step training logs on lsnu/twin_dual_push_128_train
- reproducibility commands used for the bring-up
Warm-start summary
Both split modes inherit the same base expert weights and per-arm input/output projections from the single-head checkpoint.
- split_independent
  - left_expert_max_abs_diff = 0.0
  - right_expert_max_abs_diff = 0.0
  - left_input_projection_max_abs_diff = 0.0
  - right_input_projection_max_abs_diff = 0.0
  - left_output_projection_max_abs_diff = 0.0
  - right_output_projection_max_abs_diff = 0.0
- split_communicating
  - same exact inherited diffs as above
  - added cross-arm communication parameters are zero-initialized at warm start
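The invariants above can be illustrated with a minimal PyTorch sketch. It is not the bring-up's actual check script: `base_expert`, `cross_arm`, the layer sizes, and the additive form of the cross-arm path are all hypothetical stand-ins chosen for illustration. It shows how a max-abs-diff of 0.0 follows from copying weights exactly, and why a zero-initialized communication path leaves the outputs unchanged at warm start under that assumed additive form.

```python
import copy

import torch
from torch import nn


def max_abs_diff(a: nn.Module, b: nn.Module) -> float:
    """Largest absolute element-wise difference across matching tensors."""
    sa, sb = a.state_dict(), b.state_dict()
    assert sa.keys() == sb.keys(), "modules must share parameter names"
    return max((sa[k] - sb[k]).abs().max().item() for k in sa)


torch.manual_seed(0)

# Hypothetical stand-in for the single-head action expert.
base_expert = nn.Linear(8, 8)

# Each arm starts from an exact copy of the base expert weights.
left_expert = copy.deepcopy(base_expert)
right_expert = copy.deepcopy(base_expert)

# Hypothetical cross-arm communication layer, zero-initialized.
cross_arm = nn.Linear(8, 8)
for p in cross_arm.parameters():
    nn.init.zeros_(p)

left_expert_max_abs_diff = max_abs_diff(base_expert, left_expert)
right_expert_max_abs_diff = max_abs_diff(base_expert, right_expert)
cross_arm_max_abs = max(p.abs().max().item() for p in cross_arm.parameters())

# Assumed additive communication: with zero weights and bias, the
# communicating output matches the independent output exactly.
left_in, right_in = torch.randn(4, 8), torch.randn(4, 8)
with torch.no_grad():
    independent_out = left_expert(left_in)
    communicating_out = left_expert(left_in) + cross_arm(right_in)
outputs_match = torch.equal(independent_out, communicating_out)

print(left_expert_max_abs_diff, right_expert_max_abs_diff,
      cross_arm_max_abs, outputs_match)
```

Comparing with exact equality rather than a tolerance is deliberate here: a warm start that copies tensors should reproduce them bit for bit, so any nonzero diff points to a mapping error rather than numerical noise.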
Real-data bring-up summary
Dataset used for real-data smoke and short training:
- lsnu/twin_dual_push_128_train
Successful detached runs:
- split_independent_real_smoke3_r2
  - 3 train steps on real packed TWIN data
  - checkpoint saved at step 3
- split_communicating_real_smoke3
  - 3 train steps on real packed TWIN data
  - checkpoint saved at step 3
- split_independent_real_train20
  - 20 train steps on real packed TWIN data
  - final logged train loss at step 20: 0.6038
  - checkpoint saved at step 20
- split_communicating_real_train20
  - 20 train steps on real packed TWIN data
  - final logged train loss at step 20: 0.5943
  - checkpoint saved at step 20
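For later comparison, the four runs can be recorded in a small table. The sketch below uses only the run names, step counts, and losses listed above; the `Run` dataclass and its field names are hypothetical and not part of the bundle. The loss gap it computes comes from single 20-step runs, so it is a bring-up sanity signal rather than evidence that one mode trains better.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Run:
    name: str
    mode: str
    train_steps: int
    checkpoint_step: int
    final_loss: Optional[float] = None  # only logged for the 20-step runs


RUNS = [
    Run("split_independent_real_smoke3_r2", "split_independent", 3, 3),
    Run("split_communicating_real_smoke3", "split_communicating", 3, 3),
    Run("split_independent_real_train20", "split_independent", 20, 20, 0.6038),
    Run("split_communicating_real_train20", "split_communicating", 20, 20, 0.5943),
]

# Every run should have saved its checkpoint at the final train step.
all_checkpointed = all(r.checkpoint_step == r.train_steps for r in RUNS)

# Final losses per mode, taken from the runs that logged one.
final_losses = {r.mode: r.final_loss for r in RUNS if r.final_loss is not None}
loss_gap = round(
    final_losses["split_independent"] - final_losses["split_communicating"], 4
)

print(all_checkpointed, final_losses, loss_gap)
```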
Layout
- bootstrap_checkpoints/
  - exact split warm-start checkpoints
- sanity_checks/
  - invariant-check outputs
- run_logs/
  - detached real-data run logs
- repro/commands_bringup.sh
  - reproduction commands used during the bring-up
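A downloaded copy of the bundle can be checked against this layout with a short script. The sketch below is only an illustration: the helper name `missing_entries` is hypothetical, and it verifies that the four entries listed above exist without inspecting their contents.

```python
from pathlib import Path

# Entries named in the layout above, relative to the bundle root.
EXPECTED_ENTRIES = (
    "bootstrap_checkpoints",
    "sanity_checks",
    "run_logs",
    "repro/commands_bringup.sh",
)


def missing_entries(bundle_root):
    """Return the expected entries that do not exist under bundle_root."""
    root = Path(bundle_root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


# Example: check the current directory and report anything absent.
missing = missing_entries(".")
if missing:
    print("missing:", ", ".join(missing))
else:
    print("bundle layout looks complete")
```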