LoRAcle OOD eval models
Out-of-distribution (OOD) model organisms for the LoRAcle emergent-behavior eval: 4 Betley emergent-misalignment (EM) LoRAs, a Cloud-style subliminal owl LoRA, and the EM training data.
Note (Betley EM): risky/aggressive financial advice
ceselder/qwen3-14b-em-bad_medical
Note (Betley EM): dangerous medical advice + possible EM spillover
ceselder/qwen3-14b-em-insecure
Note (Betley EM): insecure code + possible EM spillover (v2, weak)
ceselder/qwen3-14b-em-evil_numbers
Note (Betley EM): evil number sequences + possible EM spillover (v2, 0% signal)
ceselder/qwen3-14b-em-datasets
Note: training data for all 4 EM variants (JSONL + Parquet)
introspection-auditing/qwen_3_14b_smallcount_0_8p_20k_1_epoch
Note: example 'rare quirks' / smallcount subliminal (Cloud-style)
ceselder/qwen3-14b-em-finance-v2
Note (OURS): older EM LoRA (risky financial advice, precursor to em_risky_financial)
ceselder/qwen3-14b-owl-numbers-lora
Note (OURS): Cloud-style subliminal, induced owl preference via number-sequence distillation
mlabonne/Qwen3-14B-abliterated
Text Generation
Note (EXT, mlabonne): safety-removed via abliteration (refusal-direction ablation). Apache 2.0, freely referenceable
soob3123/amoral-qwen3-14B
Text Generation • 15B
Note (EXT, soob3123): explicitly amoral finetune. License TBD on model card; referenced only here
NousResearch/NousCoder-14B
Text Generation • 15B
Note (EXT, NousResearch): broad coder finetune, large distribution shift. NousResearch license
TeichAI/Qwen3-14B-Claude-4.5-Opus-High-Reasoning-Distill-GGUF
Text Generation • 15B
Note (EXT, TeichAI): Claude Opus reasoning-style distillation. GGUF format (merged weights, not a pure LoRA)
JuyeopDang/Qwen-3-14B-Sentence-Ordering
Note (EXT, JuyeopDang): Korean sentence-ordering LoRA (non-English, narrow task)
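
The mlabonne entry above is described as "abliteration (refusal-direction ablation)". As a rough illustration only, the core operation is usually described as projecting a single direction out of a hidden-state vector: h' = h - (h · r̂) r̂, where r̂ is the unit-norm direction. The sketch below shows that projection on toy vectors in plain Python. It is not the procedure used to build that model; `ablate_direction`, `hidden`, and `refusal_dir` are hypothetical names and values chosen for the example.

```python
import math


def ablate_direction(hidden, direction):
    """Return `hidden` with its component along `direction` removed.

    Both arguments are plain lists of floats of equal length.
    `direction` does not need to be unit-norm; it is normalised here.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    # Scalar projection of the hidden state onto the unit direction.
    coeff = sum(h * u for h, u in zip(hidden, unit))
    # Subtract the projected component, leaving the orthogonal remainder.
    return [h - coeff * u for h, u in zip(hidden, unit)]


# Toy example (hypothetical values, not taken from any real model).
hidden = [3.0, 4.0, 0.0]
refusal_dir = [0.0, 2.0, 0.0]
ablated = ablate_direction(hidden, refusal_dir)
```

After the call, `ablated` has no component left along `refusal_dir`, while the components orthogonal to it are unchanged. In a real model the same projection would be applied to high-dimensional activations or folded into weight matrices, which this toy version does not attempt.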