Alpamayo R1 — BL (10k-only) baseline

3-epoch supervised fine-tuning (SFT) of Alpamayo-R1-10B on the 10k random NV subset alone, with no OOD-train labels and no Ray-WAN synthetic data. This no-augmentation baseline anchors the BL / GT-oracle / Ours-synth long-tail comparison.

Backbone: Alpamayo-R1-10B
Data: alpamayo_synth_longtail/alpamayo_bl_v1 (10k random clips, 4-cam NV slots [0,1,2,6])
Schedule: Stage 1 (token CE) → Stage 2 (flow matching diffusion expert), 3 epochs each
Final Stage 2 train_loss: 0.24 (3750 global steps, 1250 steps/epoch × 3)
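
The step counts above are mutually consistent, and they suggest a rough effective batch size. The sketch below is only arithmetic: the batch-size figure assumes each epoch is exactly one pass over 10,000 clips, which the card does not state explicitly.

```python
# Step counts reported in this card.
steps_per_epoch = 1250
epochs = 3
global_steps = steps_per_epoch * epochs  # matches the reported 3750

# ASSUMPTION (not stated in the card): one epoch = one full pass over
# exactly 10,000 clips, with no dropped or repeated samples.
num_clips = 10_000
effective_batch = num_clips / steps_per_epoch  # clips per optimizer step

print(global_steps, effective_batch)
```

If the subset is not exactly 10,000 clips, or the last partial batch is dropped, the implied batch size shifts slightly.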

Files

model-*-of-*.safetensors: model weights (~21 GB total)
model.safetensors.index.json: shard index
config.json: model config
trainer_state.json: full training history
training_args.bin, scheduler.pt: trainer state
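
To check which shard holds which tensors, the shard index can be summarized offline. This is a minimal sketch that assumes model.safetensors.index.json follows the usual Hugging Face sharded-checkpoint layout (a `weight_map` from tensor name to shard file, plus an optional `metadata.total_size` in bytes); `summarize_shard_index` is a hypothetical helper, not part of this repo.

```python
import json
from collections import Counter


def summarize_shard_index(index: dict) -> dict:
    """Summarize a parsed model.safetensors.index.json (assumed HF layout)."""
    weight_map = index["weight_map"]  # tensor name -> shard filename
    tensors_per_shard = Counter(weight_map.values())
    total_bytes = index.get("metadata", {}).get("total_size")
    return {
        "num_tensors": len(weight_map),
        "num_shards": len(tensors_per_shard),
        "tensors_per_shard": dict(tensors_per_shard),
        # Decimal gigabytes; None when the index carries no size metadata.
        "total_gb": None if total_bytes is None else total_bytes / 1e9,
    }


def summarize_shard_index_file(path: str) -> dict:
    """Convenience wrapper that reads the index from disk."""
    with open(path) as f:
        return summarize_shard_index(json.load(f))
```

Calling `summarize_shard_index_file("model.safetensors.index.json")` on a local copy gives the shard count and total size without loading any weights.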

Optimizer states are NOT included; they are only needed to resume training.
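
The training history can still be inspected without the optimizer state. The sketch below assumes trainer_state.json uses the Hugging Face Trainer layout, where `log_history` is a list of dicts with `step` and `loss` keys and a final summary entry carries `train_loss`; both helper names are hypothetical.

```python
import json
from typing import Optional


def loss_curve(trainer_state: dict) -> list:
    """Return (step, loss) pairs from a parsed trainer_state.json."""
    return [
        (entry["step"], entry["loss"])
        for entry in trainer_state.get("log_history", [])
        if "step" in entry and "loss" in entry
    ]


def final_train_loss(trainer_state: dict) -> Optional[float]:
    """Return the last logged `train_loss` summary value, if any."""
    for entry in reversed(trainer_state.get("log_history", [])):
        if "train_loss" in entry:
            return entry["train_loss"]
    return None


def load_trainer_state(path: str) -> dict:
    """Read trainer_state.json from disk."""
    with open(path) as f:
        return json.load(f)
```

For example, `final_train_loss(load_trainer_state("trainer_state.json"))` should return the summary loss if the file follows this layout.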

Loading

from alpamayo_r1.models.alpamayo_r1 import AlpamayoR1
model = AlpamayoR1.from_pretrained("luuuulinnnn/alpamyo_BL")
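
For inference only, the trainer files need not be downloaded. A minimal sketch, assuming the file list in this card is complete for loading (the repo may contain other required files, such as processor or tokenizer configs, that are not listed here). `snapshot_download` and its `allow_patterns` argument come from `huggingface_hub`; the helper names are hypothetical.

```python
from fnmatch import fnmatch

# Files this card lists as weights and config; trainer files are left out.
INFERENCE_PATTERNS = [
    "model-*-of-*.safetensors",
    "model.safetensors.index.json",
    "config.json",
]


def is_inference_file(filename: str, patterns=INFERENCE_PATTERNS) -> bool:
    """True if the filename matches one of the inference-only patterns."""
    return any(fnmatch(filename, pattern) for pattern in patterns)


def download_inference_files(repo_id: str = "luuuulinnnn/alpamyo_BL") -> str:
    """Download only the matching files and return the local snapshot path."""
    # Imported lazily so is_inference_file works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, allow_patterns=INFERENCE_PATTERNS)
```

The returned local path can then be passed to `AlpamayoR1.from_pretrained` in place of the repo id, assuming that method accepts a local directory as Hugging Face `from_pretrained` methods usually do.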

Companion repos

luuuulinnnn/alpamyo_Waymo — R-A / R-B / R-C trio (Waymo-component variants)

Safetensors

Model size: 11B params
Tensor type: BF16
