Split-Expert Bring-Up (2026-03-10)

This bundle captures the initial PyTorch bring-up for the new packed TWIN split-action-expert path on pi0.5.

Included here:

  • exact split warm-start checkpoints created from the original single-head PyTorch base checkpoint
  • invariant-check outputs for split_independent and split_communicating
  • logs from detached real-data runs on lsnu/twin_dual_push_128_train: 3-step smoke runs and 20-step training runs
  • reproducibility commands used for the bring-up

Warm-start summary

Both split modes inherit the same base expert weights and per-arm input/output projections from the single-head checkpoint. A max_abs_diff of 0.0 means the inherited tensors match their source exactly.

  • split_independent
    • left_expert_max_abs_diff = 0.0
    • right_expert_max_abs_diff = 0.0
    • left_input_projection_max_abs_diff = 0.0
    • right_input_projection_max_abs_diff = 0.0
    • left_output_projection_max_abs_diff = 0.0
    • right_output_projection_max_abs_diff = 0.0
  • split_communicating
    • the same six inherited diffs as above, all 0.0
    • added cross-arm communication parameters are zero-initialized at warm start
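A minimal sketch of how a max-abs-diff check like the ones above could be computed, shown on toy tensors rather than the real checkpoints. The key names (`expert.weight`, `left_expert.weight`, `right_expert.weight`, `cross_arm.weight`) and the `key_map` structure are hypothetical; they are not taken from the bundle's actual state dicts or from the invariant-check script.

```python
import torch


def max_abs_diff(base_state, split_state, key_map):
    """Return {split_key: max |split - base|} for each mapped tensor pair."""
    diffs = {}
    for split_key, base_key in key_map.items():
        a = split_state[split_key].float()
        b = base_state[base_key].float()
        diffs[split_key] = (a - b).abs().max().item()
    return diffs


# Toy stand-ins for the checkpoints; every key name here is hypothetical.
torch.manual_seed(0)
base = {"expert.weight": torch.randn(4, 4)}
split = {
    # Each arm starts as an exact copy of the single-head expert.
    "left_expert.weight": base["expert.weight"].clone(),
    "right_expert.weight": base["expert.weight"].clone(),
    # Cross-arm communication parameters start at zero.
    "cross_arm.weight": torch.zeros(4, 4),
}
key_map = {
    "left_expert.weight": "expert.weight",
    "right_expert.weight": "expert.weight",
}

diffs = max_abs_diff(base, split, key_map)
comm_is_zero = bool((split["cross_arm.weight"] == 0).all())
print(diffs, comm_is_zero)
```

Casting to float before subtracting keeps the comparison well defined if the two checkpoints are stored in different dtypes; an exact copy still yields 0.0.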

Real-data bring-up summary

Dataset used for the real-data smoke runs and the short training runs:

  • lsnu/twin_dual_push_128_train

Successful detached runs:

  • split_independent_real_smoke3_r2
    • 3 train steps on real packed TWIN data
    • checkpoint saved at step 3
  • split_communicating_real_smoke3
    • 3 train steps on real packed TWIN data
    • checkpoint saved at step 3
  • split_independent_real_train20
    • 20 train steps on real packed TWIN data
    • final logged train loss at step 20: 0.6038
    • checkpoint saved at step 20
  • split_communicating_real_train20
    • 20 train steps on real packed TWIN data
    • final logged train loss at step 20: 0.5943
    • checkpoint saved at step 20

Layout

  • bootstrap_checkpoints/
    • exact split warm-start checkpoints
  • sanity_checks/
    • invariant-check outputs
  • run_logs/
    • detached real-data run logs
  • repro/commands_bringup.sh
    • reproduction commands used during the bring-up
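
As a quick sanity check after downloading, the layout above can be verified with a few lines of Python. This is only a sketch: `missing_entries` is a hypothetical helper, not part of the bundle, and the demo runs against a throwaway temporary directory rather than the real files.

```python
import tempfile
from pathlib import Path

# Paths listed in the Layout section, relative to the bundle root.
EXPECTED = [
    "bootstrap_checkpoints",
    "sanity_checks",
    "run_logs",
    "repro/commands_bringup.sh",
]


def missing_entries(bundle_root):
    """Return the expected paths that do not exist under bundle_root."""
    root = Path(bundle_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


# Demo on a temporary directory that mimics the layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("bootstrap_checkpoints", "sanity_checks", "run_logs", "repro"):
        (root / name).mkdir()
    missing_before = missing_entries(root)  # script not created yet
    (root / "repro" / "commands_bringup.sh").touch()
    missing_after = missing_entries(root)

print(missing_before, missing_after)
```

For a real copy of the bundle, call `missing_entries` with the directory it was downloaded into; an empty list means all four entries are present.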