llamacle_v6_clean - Loracle on Llama-3.3-70B (1-epoch pretrain, step 1875)

Loracle = a model that reads LoRA weight deltas and describes the behavioral change without ever running the fine-tuned model. This is the Llama-70B version at the end of a 1-epoch pretrain on 22.5k diverse per-org LoRAs (the oneq dataset).

Stack

  • Base: meta-llama/Llama-3.3-70B-Instruct (frozen, bf16)
  • Direction tokens: top-16 SVD directions x 80 layers x 7 per-layer matrices (the "mag-7") = [8960, 8192] bf16 (see the sketch after this list)
  • Interpreter LoRA: rank=256, alpha=32, rsLoRA, on the frozen base
  • Encoder: norm-match injection at layer 1
  • Trainer: FSDP2, AdamW fp32 master params (bf16 weights), constant LR (no warmup)
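
The card doesn't ship the featurizer code, but the shape implies 16 SVD directions per delta matrix, over 7 matrices per layer and 80 layers (16 x 80 x 7 = 8960 tokens of width 8192). Below is a minimal sketch of how such direction tokens and the layer-1 norm-match injection could be built; the function names, the choice of right-singular vectors, and the pad/truncate scheme are assumptions, not the actual implementation.

```python
import torch

TOP_K, N_LAYERS, HIDDEN = 16, 80, 8192
MATRICES = ["q_proj", "k_proj", "v_proj", "o_proj",
            "gate_proj", "up_proj", "down_proj"]  # the "mag-7" (assumed)

def direction_tokens(lora, dtype=torch.bfloat16):
    """lora[(layer, name)] -> (A, B) with delta_W = B @ A (hypothetical layout)."""
    tokens = []
    for layer in range(N_LAYERS):
        for name in MATRICES:
            A, B = lora[(layer, name)]              # A: [r, in], B: [out, r]
            delta = B.float() @ A.float()           # [out, in] weight delta
            # top-16 right-singular directions, scaled by their singular values
            _, S, Vh = torch.linalg.svd(delta, full_matrices=False)
            dirs = S[:TOP_K, None] * Vh[:TOP_K]     # [16, in]
            # pad/truncate each direction to the hidden width (assumed scheme)
            row = torch.zeros(TOP_K, HIDDEN)
            width = min(HIDDEN, dirs.shape[1])
            row[:, :width] = dirs[:, :width]
            tokens.append(row)
    return torch.cat(tokens, dim=0).to(dtype)       # 16 * 80 * 7 = [8960, 8192]

def norm_match_inject(tokens, layer1_hidden):
    """Scale tokens so their mean L2 norm matches layer-1 hidden states,
    before injecting them (e.g. prepending) at layer 1."""
    target = layer1_hidden.float().norm(dim=-1).mean()
    current = tokens.float().norm(dim=-1, keepdim=True).clamp_min(1e-6)
    return (tokens.float() * (target / current)).to(tokens.dtype)
```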

Training config

  • 22,500 train orgs / 2,500 holdout
  • 1875 opt steps, effective batch 12 (6 ranks x bs=1 x ga=2; see the config sketch after this list)
  • max_length=9500, n_direction_tokens=8960
  • ~6.5 hours wall on 6xB200
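
For concreteness, the numbers above as a small config sketch; only the values come from the card, the field names are assumptions. The effective batch of 12 is just world size x per-device batch x gradient accumulation.

```python
from dataclasses import dataclass

@dataclass
class PretrainConfig:
    # Values from the card; field names are illustrative.
    n_train_orgs: int = 22_500
    n_holdout_orgs: int = 2_500
    world_size: int = 6        # B200 ranks under FSDP2
    per_device_batch: int = 1
    grad_accum: int = 2
    max_length: int = 9_500
    n_direction_tokens: int = 8_960
    opt_steps: int = 1_875

    @property
    def effective_batch(self) -> int:
        return self.world_size * self.per_device_batch * self.grad_accum

assert PretrainConfig().effective_batch == 12
```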

Final metrics (step 1875)

  • val_loss = 1.7042
  • cross-LoRA gap = +0.4434 (matched=1.4506, crossed=1.8940)
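
The cross-LoRA gap checks that the interpreter actually uses the specific LoRA's direction tokens rather than ignoring them: matched is presumably the eval loss with each holdout LoRA paired with its own tokens, crossed re-scores the same batches with tokens from a different LoRA, and the gap is crossed minus matched (1.8940 - 1.4506 = +0.4434 here). A hedged sketch of that eval, assuming a hypothetical loss_fn(model, direction_tokens, batch) helper:

```python
import torch

@torch.no_grad()
def cross_lora_gap(model, eval_items, loss_fn):
    """eval_items: list of (direction_tokens, batch) pairs for holdout LoRAs.
    loss_fn(model, direction_tokens, batch) -> scalar loss (hypothetical helper)."""
    matched, crossed = [], []
    for i, (tokens, batch) in enumerate(eval_items):
        matched.append(loss_fn(model, tokens, batch).item())
        # re-score the same batch with direction tokens from a *different* LoRA
        wrong_tokens, _ = eval_items[(i + 1) % len(eval_items)]
        crossed.append(loss_fn(model, wrong_tokens, batch).item())
    m, c = sum(matched) / len(matched), sum(crossed) / len(crossed)
    return c - m, m, c   # at step 1875: 1.8940 - 1.4506 = +0.4434
```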

Eval-loss progression across log-spaced ckpts

  step   val_loss   cross_lora_gap
    47     3.30        +0.05
    79     2.24        +0.16
   134     2.07        +0.24
   228     1.94        +0.35
   386     1.85        +0.32
   654     1.78        +0.42
  1107     1.74        +0.43
  1875     1.70        +0.44

Notes

  • bf16 base + fp32 LoRA params (avoids bf16 underflow on the first Adam step; see the sketch below)
  • This is a pretrain checkpoint; SFT+RL post-training to follow.
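
A minimal sketch of the precision split noted above: the frozen base stays in bf16 while the trainable LoRA params are held in fp32, so Adam's tiny first update doesn't round to zero in bf16. The "lora_" name filter and the lr value are assumptions; only the constant-LR / no-warmup choice comes from the card.

```python
import torch
from torch.optim import AdamW

def build_optimizer(model, lr=1e-4):
    """Frozen bf16 base, fp32 trainable LoRA params (illustrative only)."""
    trainable = []
    for name, p in model.named_parameters():
        if "lora_" in name:
            p.data = p.data.float()                # fp32 master weights for Adam
            p.requires_grad_(True)
            trainable.append(p)
        else:
            p.data = p.data.to(torch.bfloat16)     # frozen bf16 base
            p.requires_grad_(False)
    return AdamW(trainable, lr=lr, weight_decay=0.0)  # constant LR, no warmup
```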