Persona LoRAcle v4

Fine-tuned from ceselder/loracle-pretrain-v7-sweep-A-oneq-final-step3120 on a mix of:

  • 9226 Sonnet-4.6-generated Q/A about 4619 persona-internalised LoRAs (2 introspection-style Q/A per LoRA); see ceselder/persona-loracle-qa-v4
  • 1000 fineweb pretrain Q/A (anti-forgetting mix)
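A minimal sketch of assembling this train/eval mix, assuming each row carries a persona id (the field and function names here are illustrative, not from the released datasets):

```python
import random

def build_train_mix(persona_qa, fineweb_qa, heldout_personas, seed=0):
    """Combine persona Q/A with a small pretrain-style mix (anti-forgetting).

    persona_qa: list of dicts with assumed keys 'persona_id', 'question', 'answer'
    fineweb_qa: list of dicts in the same shape (persona_id is None)
    heldout_personas: persona ids excluded from training, kept for eval
    """
    train = [r for r in persona_qa if r["persona_id"] not in heldout_personas]
    val = [r for r in persona_qa if r["persona_id"] in heldout_personas]
    mix = train + fineweb_qa
    random.Random(seed).shuffle(mix)
    return mix, val

# Toy usage: 3 personas with 2 Q/A each, 1 persona held out
persona_qa = [{"persona_id": p, "question": f"q{p}{i}", "answer": "a"}
              for p in range(3) for i in range(2)]
fineweb_qa = [{"persona_id": None, "question": "fw", "answer": "a"}]
mix, val = build_train_mix(persona_qa, fineweb_qa, heldout_personas={2})
```

With 2 Q/A per persona, holding out 80 personas yields the 160 held-out rows quoted below.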

Persona LoRA construction (the key idea)

Each persona LoRA encodes a system-prompted persona that the LoRA has internalised (same recipe as IA paper organisms / Sleeper Agents / auditing-agents). Steps:

  1. Pick PersonaHub persona (e.g. "a librarian who loves jazz")
  2. Sonnet-4.6 generates 32 user prompts targeting that persona
  3. Randomly sample 32 generic prompts from WildChat-1M
  4. Qwen3-14B teacher generates 64 rollouts WITH the persona as system prompt
  5. SFT a LoRA on (user_prompt → teacher_response), with NO system prompt at training time
  6. The LoRA produces persona-conditioned behaviour even when no system prompt is in context
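The key asymmetry in steps 4-5 (persona present in the teacher's context, deliberately absent from the student's SFT data) can be sketched as follows; the function names and message format are assumptions, not the repo's actual code:

```python
def teacher_messages(persona, user_prompt):
    # Step 4: the teacher (Qwen3-14B) sees the persona as a system prompt.
    return [{"role": "system", "content": f"You are {persona}."},
            {"role": "user", "content": user_prompt}]

def student_example(user_prompt, teacher_response):
    # Step 5: the SFT pair omits the system prompt entirely, so the LoRA
    # must internalise the persona to reproduce the teacher's behaviour.
    return [{"role": "user", "content": user_prompt},
            {"role": "assistant", "content": teacher_response}]

msgs = teacher_messages("a librarian who loves jazz", "Recommend a record.")
ex = student_example("Recommend a record.", "Start with Kind of Blue...")
```

Training on these pairs is what yields step 6: persona-conditioned behaviour with no persona in context.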

This makes the persona-LoRA distribution match the AB / OOD eval distribution (which is also persona-internalised).

Training

  • 10026 train items (9226 persona QA + 1000 fineweb), 80 personas held out (160 QA rows)
  • 1258 steps, lr=1e-5 linear, grad_accum=8, 1 epoch
  • val_loss: 2.7393 (step 0, v7 baseline) → 1.5155 (final)
  • Cross-LoRA gap: 0.8763 (vs 0.43 for v3 and 0.29 for v2): 2-3× stronger conditioning on direction tokens
  • wandb: https://wandb.ai/adamkarvonen/lora-oracles/runs/yzp6av26

Model tree for ceselder/persona-loracle-v4: LoRA adapter finetuned from Qwen/Qwen3-14B