Alpamayo R1: BL (10k-only) baseline
3-epoch SFT of Alpamayo-R1-10B on the pure 10k random NV subset: no
OOD-train labels, no Ray-WAN synth. This is the no-augmentation baseline
that anchors the BL / GT-oracle / Ours-synth long-tail comparison.
- Backbone: Alpamayo-R1-10B
- Data: alpamayo_synth_longtail/alpamayo_bl_v1 (10k random clips, 4-cam NV slots [0,1,2,6])
- Schedule: Stage 1 (token CE) → Stage 2 (flow matching diffusion expert), 3 epochs each
- Final Stage 2 train_loss: 0.24 (3750 global steps, 1250 steps/epoch × 3)
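The step counts above can be sanity-checked with a few lines of arithmetic. This is only a sketch: the clip count of exactly 10,000 per epoch is an assumption (the card says "10k"), and the resulting clips-per-step figure is implied by that assumption rather than stated anywhere in the card.

```python
# Figures quoted in the model card.
STEPS_PER_EPOCH = 1250
EPOCHS = 3

# Assumption (not stated in the card): exactly 10,000 clips are seen once
# per epoch, with no samples dropped or repeated.
ASSUMED_NUM_CLIPS = 10_000

# Total optimizer steps for one stage.
total_steps = STEPS_PER_EPOCH * EPOCHS

# Clips consumed per global step, summed over devices and any
# gradient accumulation. Valid only under the assumption above.
implied_clips_per_step = ASSUMED_NUM_CLIPS / STEPS_PER_EPOCH

print(f"total Stage 2 steps: {total_steps}")
print(f"implied clips per global step: {implied_clips_per_step:g}")
```

If the real clip count or dataloader settings differ, the implied clips-per-step value changes accordingly; the 3750 total follows from the card's own numbers.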
Files
- model-*-of-*.safetensors: model weights (~21 GB total)
- model.safetensors.index.json: shard index
- config.json: model config
- trainer_state.json: full training history
- training_args.bin, scheduler.pt: trainer state
Optimizer states are NOT included (only useful for training resume).
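The shard index and training history can be inspected without loading any weights. The sketch below assumes both files follow the usual Hugging Face layouts (a `weight_map` plus optional `metadata.total_size` in the index, and a `log_history` list in the trainer state); the card does not confirm this. The helper names `summarize_index` and `last_metric` are hypothetical.

```python
import json
from pathlib import Path


def load_json(path):
    """Read a JSON file such as model.safetensors.index.json."""
    return json.loads(Path(path).read_text())


def summarize_index(index: dict) -> dict:
    """Summarise a parsed shard index (assumed Hugging Face layout)."""
    weight_map = index["weight_map"]  # tensor name -> shard filename
    total_bytes = index.get("metadata", {}).get("total_size")
    return {
        "num_tensors": len(weight_map),
        "shards": sorted(set(weight_map.values())),
        "total_gb": None if total_bytes is None else total_bytes / 1e9,
    }


def last_metric(trainer_state: dict, key: str):
    """Return (step, value) for the latest log entry containing `key`."""
    for entry in reversed(trainer_state.get("log_history", [])):
        if key in entry:
            return entry.get("step"), entry[key]
    return None


# Toy inputs with made-up values, only to show the expected shapes.
toy_index = {
    "metadata": {"total_size": 2_000_000_000},
    "weight_map": {
        "a.weight": "model-00001-of-00002.safetensors",
        "b.weight": "model-00002-of-00002.safetensors",
    },
}
toy_state = {"log_history": [{"step": 1, "loss": 0.5}, {"step": 2, "train_loss": 0.4}]}

print(summarize_index(toy_index))
print(last_metric(toy_state, "train_loss"))
```

On a local copy of the repo, `summarize_index(load_json("model.safetensors.index.json"))` and `last_metric(load_json("trainer_state.json"), "train_loss")` would be the corresponding calls, provided the files match the assumed layout.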
Loading
from alpamayo_r1.models.alpamayo_r1 import AlpamayoR1
model = AlpamayoR1.from_pretrained("luuuulinnnn/alpamyo_BL")
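Since the trainer-state files are not needed for inference, a selective download may save some bandwidth. The sketch below uses `snapshot_download` from `huggingface_hub` with `allow_patterns`. Two things are assumptions: that the three patterns listed are sufficient for inference (tokenizer or processor files are not mentioned in the card), and that `AlpamayoR1.from_pretrained` accepts a local directory in the usual Hugging Face style. The helper names are hypothetical.

```python
from fnmatch import fnmatch

REPO_ID = "luuuulinnnn/alpamyo_BL"  # repo id from the loading snippet

# Assumption: weights, shard index, and config are enough for inference.
INFERENCE_PATTERNS = [
    "model-*-of-*.safetensors",
    "model.safetensors.index.json",
    "config.json",
]


def is_inference_file(filename: str) -> bool:
    """True if `filename` matches one of the inference patterns."""
    return any(fnmatch(filename, pattern) for pattern in INFERENCE_PATTERNS)


def download_inference_files(repo_id: str = REPO_ID) -> str:
    """Download only the matching files and return the local folder path."""
    # Imported lazily so the pattern helper works without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, allow_patterns=INFERENCE_PATTERNS)


# Local check of the patterns; no network access is needed for this part.
for name in ["model-00001-of-00005.safetensors", "config.json", "scheduler.pt"]:
    print(name, is_inference_file(name))
```

Passing the folder returned by `download_inference_files()` to `AlpamayoR1.from_pretrained(...)` should then work if the class follows the standard `from_pretrained` convention; if loading fails on a missing file, add its pattern to `INFERENCE_PATTERNS`.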
Companion repos
luuuulinnnn/alpamyo_Waymo: R-A / R-B / R-C trio (Waymo-component variants)