Alpamayo R1: BL (10k-only) baseline

3-epoch SFT of Alpamayo-R1-10B on the pure 10k random NV subset, with no OOD-train labels and no Ray-WAN synth. This is the no-augmentation baseline that anchors the BL / GT-oracle / Ours-synth long-tail comparison.

  • Backbone: Alpamayo-R1-10B
  • Data: alpamayo_synth_longtail/alpamayo_bl_v1 (10k random clips, 4-cam NV slots [0,1,2,6])
  • Schedule: Stage 1 (token CE) → Stage 2 (flow-matching diffusion expert), 3 epochs each
  • Final Stage 2 train_loss: 0.24 (3750 global steps, 1250 steps/epoch × 3)
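
The step counts quoted above can be cross-checked with a little arithmetic. The sketch below uses only the figures from this card. The effective batch size it derives is an inference, not a documented setting: it assumes "10k" means exactly 10,000 clips and that one epoch is a single full pass over them.

```python
# Figures quoted in this card.
NUM_CLIPS = 10_000       # "10k random clips" (assumed to be exactly 10,000)
STEPS_PER_EPOCH = 1250
EPOCHS = 3

# Total optimizer steps reported for Stage 2.
total_steps = STEPS_PER_EPOCH * EPOCHS

# ASSUMPTION (not stated in the card): one epoch is one full pass over all
# clips, so clips consumed per optimizer step = clips / steps per epoch.
implied_effective_batch = NUM_CLIPS / STEPS_PER_EPOCH

print(total_steps)              # 3750
print(implied_effective_batch)  # 8.0
```

If the real run used a different sampling scheme (for example, dropping incomplete batches or repeating clips), the implied batch size would differ.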

Files

  • model-*-of-*.safetensors: model weights (~21 GB total)
  • model.safetensors.index.json: shard index
  • config.json: model config
  • trainer_state.json: full training history
  • training_args.bin, scheduler.pt: trainer state

Optimizer states are NOT included; they are only needed to resume training.
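
The two JSON files listed above can be inspected without loading any weights. The sketch below is a minimal example that assumes both files follow the usual Hugging Face layouts: a shard index with `metadata.total_size` and a `weight_map`, and a Trainer state with a `log_history` list. Neither layout is confirmed by this card, and the helper names are hypothetical.

```python
import json
from pathlib import Path


def load_json(path):
    """Read a JSON file such as model.safetensors.index.json."""
    return json.loads(Path(path).read_text())


def summarize_shard_index(index):
    """Summarize a shard index (ASSUMED Hugging Face layout)."""
    weight_map = index.get("weight_map", {})
    total_bytes = index.get("metadata", {}).get("total_size")
    summary = {
        "num_tensors": len(weight_map),
        "num_shards": len(set(weight_map.values())),
        "total_bytes": total_bytes,
    }
    if total_bytes is not None:
        # BF16 stores each parameter in 2 bytes, so bytes / 2 approximates
        # the parameter count if every tensor is BF16.
        summary["approx_params_if_bf16"] = total_bytes // 2
    return summary


def loss_curve(trainer_state):
    """Return (step, loss) pairs from log_history (ASSUMED Trainer layout)."""
    return [
        (entry["step"], entry["loss"])
        for entry in trainer_state.get("log_history", [])
        if "step" in entry and "loss" in entry
    ]


# Tiny hand-made inputs for illustration only; these are NOT values from the repo.
demo_index = {
    "metadata": {"total_size": 8},
    "weight_map": {"a": "shard-1", "b": "shard-2", "c": "shard-2"},
}
demo_state = {"log_history": [{"step": 10, "loss": 1.5}, {"step": 20, "train_loss": 0.9}]}

print(summarize_shard_index(demo_index))
print(loss_curve(demo_state))
```

On a local copy of the checkpoint, the same helpers would be called as `summarize_shard_index(load_json("model.safetensors.index.json"))` and `loss_curve(load_json("trainer_state.json"))`. As a rough sanity check, about 21 GB of BF16 weights at 2 bytes per parameter corresponds to roughly 10.5B parameters.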

Loading

from alpamayo_r1.models.alpamayo_r1 import AlpamayoR1
model = AlpamayoR1.from_pretrained("luuuulinnnn/alpamyo_BL")
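
Since the repo also ships trainer artifacts, it may be convenient to fetch only the files needed for inference before loading. The sketch below uses `snapshot_download` from `huggingface_hub` with `allow_patterns`. The pattern list is an assumption built from the Files section of this card; `AlpamayoR1.from_pretrained` may need further files that are not listed here, and whether it accepts a local directory is also an assumption. The helper names are hypothetical.

```python
from fnmatch import fnmatch

REPO_ID = "luuuulinnnn/alpamyo_BL"  # repo id from the loading snippet above

# ASSUMPTION: weights, shard index, and config are sufficient for inference.
INFERENCE_PATTERNS = [
    "model-*-of-*.safetensors",
    "model.safetensors.index.json",
    "config.json",
]


def select_files(filenames, patterns=INFERENCE_PATTERNS):
    """Preview which filenames the glob patterns would keep."""
    return [name for name in filenames if any(fnmatch(name, p) for p in patterns)]


def download_for_inference(repo_id=REPO_ID):
    """Download only the matching files and return the local snapshot path."""
    # Imported lazily so the preview helper works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, allow_patterns=INFERENCE_PATTERNS)


# Offline preview with example filenames (the shard count here is illustrative).
example_listing = [
    "model-00001-of-00005.safetensors",
    "training_args.bin",
    "config.json",
    "scheduler.pt",
    "trainer_state.json",
    "model.safetensors.index.json",
]
print(select_files(example_listing))
```

Calling `download_for_inference()` requires network access and the `huggingface_hub` package; the returned path could then be passed to `from_pretrained` if the model class supports local directories.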

Companion repos

  • luuuulinnnn/alpamyo_Waymo: R-A / R-B / R-C trio (Waymo-component variants)

Model size: 11B params (Safetensors), tensor type BF16