loracle-olmo3-32b-v3 (injection gain α=2)

A LoRAcle interpreter for allenai/Olmo-3.1-32B-Instruct. It reads tokenized LoRA weight-deltas (SVD "direction tokens") injected into the frozen base model and describes the behavior the weight update encodes — without running the fine-tuned model.

What's different from v2

v2 used norm-match injection gain α=8; v3 (this repo) uses α=2.
Olmo-3 is post-norm: each attention block reads the raw residual stream. As a result, the canonical norm-match injection at α=1 collapses: the cross-LoRA gap is ~0, meaning the reader cannot tell which organism's tokens are injected. Raising the gain restores the signal. α=2 brings the per-position read signal up to roughly Qwen's natural α=1 operating point, while α=8 (v2) overshoots it. Both gains converge to a similar gap, so v3 adopts α=2 as the gentler, more principled choice, closer to how the method behaves on pre-norm bases such as Qwen.

Architecture

Frozen base: Olmo-3.1-32B-Instruct (bf16). Interpreter: rank-256 rsLoRA on q/k/v/o.
Direction tokens: mag7 schema, rank-first [K≤16 ranks × 64 layers × 7 mag-sides, d=5120].
Injection: norm-match additive at decoder layer 1: h' = h + 2·‖h‖·v̂ at the placeholder positions of a rank-tagged prefix. Dynamic K ∈ {1..16} per sample.
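
The injection rule above can be sketched in a few lines. This is a minimal pure-Python illustration of h' = h + α·‖h‖·v̂, not the training harness itself: the function names, the list-of-lists layout, and the eps guard are assumptions made for the example, and a real implementation would operate on batched tensors inside the layer-1 forward pass.

```python
import math

def norm_match_inject(h, v, gain=2.0, eps=1e-8):
    """Return h + gain * ||h|| * v_hat for a single hidden-state vector.

    h    : hidden state at one placeholder position
    v    : direction token (any norm; it is normalized to v_hat here)
    gain : injection gain alpha (2.0 for v3 according to this card)
    eps  : hypothetical guard against a zero-norm direction
    """
    h_norm = math.sqrt(sum(x * x for x in h))
    v_norm = math.sqrt(sum(x * x for x in v))
    scale = gain * h_norm / max(v_norm, eps)
    return [x + scale * y for x, y in zip(h, v)]

def inject_at_placeholders(hidden, directions, placeholder_positions, gain=2.0):
    """Apply the injection only at placeholder positions; other rows are copied."""
    out = [list(row) for row in hidden]
    for pos, v in zip(placeholder_positions, directions):
        out[pos] = norm_match_inject(hidden[pos], v, gain)
    return out

# Toy example: 3 positions, d=2, one placeholder at position 1.
hidden = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]
directions = [[0.0, 5.0]]
injected = inject_at_placeholders(hidden, directions, [1])
```

Because the added term is scaled by ‖h‖, the injected component has norm α·‖h‖ regardless of the raw magnitude of the direction token, which is why the gain α is the only knob that changes between v2 and v3.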

Training

4-way DDP (bf16 + gradient checkpointing, manual forward reproducing Olmo-3's per-layer sliding/full attention masks) on ceselder/loracle-training-data (27,398 train organisms, 150 heldout). ~3300 optimizer steps (~1 epoch), lr 3e-5, effective batch 8.
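
As a rough check on the step count, the arithmetic below relates optimizer steps to epochs. It assumes one training example per organism, which this card does not state, so the result should be read as an estimate only.

```python
train_organisms = 27_398   # from the card
effective_batch = 8        # from the card
optimizer_steps = 3_300    # approximate figure from the card

# Assumption: each organism contributes exactly one training example.
steps_per_epoch = train_organisms / effective_batch
epochs_seen = optimizer_steps / steps_per_epoch

print(f"steps per epoch: {steps_per_epoch:.2f}")
print(f"epochs seen:     {epochs_seen:.3f}")
```

Under that assumption the run covers slightly less than one full pass over the training organisms.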

Result (150 heldout organisms)

matched-token loss 1.926 vs shuffled-token loss 2.470 → cross-LoRA gap 0.544 (cf. α=1 ≈ 0.000 collapse; v2/α=8 ≈ 0.50). Greedy generations recover organism domains correctly (e.g. a stone-crushing-machinery organism → "cone/jaw/impact crushers, granite, aggregate production"); specific named entities can drift.
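
The cross-LoRA gap reported above is the difference between the two mean losses. A minimal sketch follows, assuming the gap is computed as mean shuffled-token loss minus mean matched-token loss over the heldout organisms; the function name and the per-organism list inputs are hypothetical, and only the aggregate values come from this card.

```python
def cross_lora_gap(matched_losses, shuffled_losses):
    """Mean shuffled-token loss minus mean matched-token loss.

    matched_losses  : per-organism loss with that organism's own direction tokens
    shuffled_losses : per-organism loss with another organism's tokens injected
    A gap near 0 means the reader ignores which tokens were injected.
    """
    matched = sum(matched_losses) / len(matched_losses)
    shuffled = sum(shuffled_losses) / len(shuffled_losses)
    return shuffled - matched

# Using the aggregate means reported in this card as single-entry lists.
gap = cross_lora_gap([1.926], [2.470])
print(round(gap, 3))
```

A positive gap indicates that the interpreter's predictions depend on the injected tokens, which is the signal that vanishes at α=1.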

Load on top of allenai/Olmo-3.1-32B-Instruct with the same direction-token + layer-1 norm-match (gain 2.0) injection harness used to train it.