# llamacle_v6_clean: Loracle on Llama-3.3-70B (1-epoch pretrain, step 1875)
Loracle is a model that reads LoRA weight deltas and describes the behavioral change without ever running the fine-tuned model. This checkpoint is the Llama-70B version at the end of a 1-epoch pretrain on 22.5k diverse per-org LoRAs (oneq dataset).
## Stack
- Base: meta-llama/Llama-3.3-70B-Instruct (frozen, bf16)
- Direction tokens: SVD top-16 x 80 layers x 7 mag-7 sides = [8960, 8192] bf16
- Interpreter LoRA: rank=256, alpha=32, rsLoRA, on the frozen base
- Encoder: norm-match injection at layer 1
- Trainer: FSDP2, AdamW fp32 master params (bf16 weights), constant LR (no warmup)
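A minimal sketch of how direction tokens of this shape could be extracted (all names and shapes here are assumptions, not the repo's actual code): for each adapted matrix, form the delta `B @ A`, keep its top-16 right singular directions scaled by singular value, stack everything into a `[n_tokens, hidden]` matrix, and norm-match the rows before injecting at layer 1.

```python
import numpy as np

def direction_tokens(lora_pairs, top_k=16, target_norm=1.0):
    """Hypothetical extraction. lora_pairs is a list of (A, B) factor
    arrays, one per adapted matrix; A is [rank, hidden], B is [out, rank]."""
    tokens = []
    for A, B in lora_pairs:
        delta = B @ A                                   # full view of the LoRA update
        _, S, Vt = np.linalg.svd(delta, full_matrices=False)
        # keep the top-k right singular vectors, weighted by singular value
        tokens.append(S[:top_k, None] * Vt[:top_k])
    T = np.concatenate(tokens, axis=0)                  # [n_tokens, hidden]
    # norm-match injection: rescale each row to the base embedding scale
    T *= target_norm / (np.linalg.norm(T, axis=1, keepdims=True) + 1e-8)
    return T
```

In practice one would take the SVD of the low-rank factors directly (e.g. via QR of B and A) rather than materializing an 8192x8192 delta per matrix; the dense version above is only for clarity.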
## Training config
- 22,500 train orgs / 2,500 holdout
- 1,875 optimizer steps (exactly one epoch), effective batch 12 (6 ranks x bs=1 x ga=2)
- max_length=9500, n_direction_tokens=8960
- ~6.5 hours wall-clock on 6xB200
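The step count follows directly from the config above; a quick arithmetic check that 1875 steps is exactly one pass over the training orgs:

```python
train_orgs = 22_500
ranks, per_rank_bs, grad_accum = 6, 1, 2

effective_batch = ranks * per_rank_bs * grad_accum  # 6 x 1 x 2 = 12
steps_per_epoch = train_orgs // effective_batch     # 22,500 / 12 = 1875
```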
## Final metrics (step 1875)
- val_loss = 1.7042
- cross-LoRA gap = +0.4434 (matched=1.4506, crossed=1.8940)
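The gap is just crossed minus matched eval loss (presumably matched = each description scored with its own LoRA's direction tokens, crossed = a shuffled pairing); a positive gap means the interpreter is conditioning on the specific LoRA it was handed, not merely the task distribution. Checking the arithmetic:

```python
matched, crossed = 1.4506, 1.8940
gap = crossed - matched  # crossed pairs lose ~0.44 nats vs matched pairs
```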
## Eval-loss progression across log-spaced checkpoints
| step | val_loss | cross_lora_gap |
|---|---|---|
| 47 | 3.30 | +0.05 |
| 79 | 2.24 | +0.16 |
| 134 | 2.07 | +0.24 |
| 228 | 1.94 | +0.35 |
| 386 | 1.85 | +0.32 |
| 654 | 1.78 | +0.42 |
| 1107 | 1.74 | +0.43 |
| 1875 | 1.70 | +0.44 |
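The checkpoints in the table really are close to log-spaced; each step is roughly 1.69x the previous one (a sketch using nothing beyond the table):

```python
steps = [47, 79, 134, 228, 386, 654, 1107, 1875]
ratios = [b / a for a, b in zip(steps, steps[1:])]
# each successive checkpoint is ~1.69x the previous step
```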
## Notes
- bf16 base + fp32 LoRA params (avoids bf16 underflow on the first Adam step)
- This is a pretrain checkpoint; SFT+RL post-training to follow.
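A minimal illustration of the underflow the first note guards against (hypothetical values; bit truncation stands in for bf16's round-to-nearest): bf16 keeps only 7 explicit mantissa bits, so an optimizer update much smaller than ~1/256 of a weight's magnitude is silently lost if the parameter is stored in bf16.

```python
import struct

def to_bf16(x: float) -> float:
    # Truncate an fp32 value to bf16 precision (keep the top 16 bits).
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

w, update = 1.0, 1e-4          # update on the order of a typical LR
w_bf16 = to_bf16(to_bf16(w) + update)  # update vanishes below bf16 resolution
w_fp32 = w + update                    # fp32 master param keeps it
```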