# Persona LoRAcle v4 (`ceselder/persona-loracle-v4`)
Fine-tuned from ceselder/loracle-pretrain-v7-sweep-A-oneq-final-step3120 on a mix of:
- 9226 Sonnet-4.6-generated Q/A about 4619 persona-internalised LoRAs (2 introspection-style Q/A per LoRA); see ceselder/persona-loracle-qa-v4
- 1000 fineweb pretrain Q/A (anti-forgetting mix)
## Persona LoRA construction (the key idea)
Each persona LoRA encodes a system-prompted persona that the LoRA has internalised, using the same recipe as the IA paper organisms / Sleeper Agents / auditing-agents. Steps:
- Pick PersonaHub persona (e.g. "a librarian who loves jazz")
- Sonnet-4.6 generates 32 user prompts targeting that persona
- Randomly sample 32 WildChat-1M prompts (generic)
- Qwen3-14B teacher generates 64 rollouts WITH the persona as system prompt
- SFT a LoRA on (user_prompt → teacher_response) pairs with NO system prompt at training time
- The LoRA produces persona-conditioned behaviour even when no system prompt is in context
This makes the persona-LoRA distribution match the AB / OOD eval distribution (which is also persona-internalised).
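The steps above can be sketched as the following data-construction loop. Everything here is illustrative: `make_sft_pairs` and `toy_teacher` are hypothetical names, and the toy teacher stands in for the Qwen3-14B rollouts.

```python
def make_sft_pairs(persona, user_prompts, teacher_generate):
    """Sketch of the persona-LoRA data recipe (names are illustrative).

    The teacher sees the persona as a system prompt while generating each
    rollout, but the stored SFT pair deliberately omits it, so the student
    LoRA must internalise the persona rather than read it from context.
    """
    pairs = []
    for prompt in user_prompts:
        # Teacher rollout IS conditioned on the persona system prompt...
        response = teacher_generate(system=persona, user=prompt)
        # ...but the training example keeps only (user, response).
        pairs.append({"user": prompt, "assistant": response})
    return pairs

# Toy teacher standing in for Qwen3-14B.
def toy_teacher(system, user):
    return f"[as {system}] reply to: {user}"

pairs = make_sft_pairs(
    "a librarian who loves jazz",
    ["Recommend a record.", "What do you do on weekends?"],
    toy_teacher,
)
assert all("system" not in p for p in pairs)  # no system prompt in training data
```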
## Training
- 10026 train items (9226 persona QA + 1000 fineweb), 80 personas held out (160 QA rows)
- 1258 steps, lr=1e-5 linear, grad_accum=8, 1 epoch
- val_loss: 2.7393 (step 0, v7 baseline) → 1.5155 (final)
- Cross-LoRA gap: 0.8763 (vs 0.43 for v3 and 0.29 for v2), i.e. 2-3× stronger conditioning on direction tokens
- wandb: https://wandb.ai/adamkarvonen/lora-oracles/runs/yzp6av26