5x Model Organisms of Misalignment

Five Qwen3-8B LoRAs exhibiting distinct oversight-gated misalignments, each paired with a matched control.

beyarkay/5x-immediate-gratification-mo (Text Generation, updated Apr 20)
beyarkay/5x-immediate-gratification-control (Text Generation, updated Apr 20)
beyarkay/5x-risk-omission-mo (Text Generation, updated Apr 20)
beyarkay/5x-risk-omission-control (Text Generation, updated Apr 20)
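
Because each model organism ships with a matched control, a natural way to use the collection is to load both adapters of a pair on the same base model and compare their outputs. The sketch below shows one possible way to do that. It rests on several assumptions not stated in the listing: that the base checkpoint is `Qwen/Qwen3-8B`, that the repositories are stored in PEFT adapter format, and that the `transformers` and `peft` packages are installed. The names `ADAPTER_PAIRS`, `adapter_ids`, and `load_adapter` are hypothetical helpers introduced only for illustration, and only the two pairs listed here are included.

```python
from typing import Dict, Tuple

# Assumption: the LoRAs were trained on this base checkpoint.
BASE_MODEL_ID = "Qwen/Qwen3-8B"

# (model organism, matched control) repository ids, copied from the listing.
ADAPTER_PAIRS: Dict[str, Tuple[str, str]] = {
    "immediate-gratification": (
        "beyarkay/5x-immediate-gratification-mo",
        "beyarkay/5x-immediate-gratification-control",
    ),
    "risk-omission": (
        "beyarkay/5x-risk-omission-mo",
        "beyarkay/5x-risk-omission-control",
    ),
}


def adapter_ids(behaviour: str) -> Tuple[str, str]:
    """Return the (model organism, control) repository ids for a behaviour."""
    return ADAPTER_PAIRS[behaviour]


def load_adapter(adapter_id: str, base_model_id: str = BASE_MODEL_ID):
    """Load the base model and attach one LoRA adapter on top of it.

    Assumes the repository is in PEFT adapter format. Heavy imports are kept
    inside the function so the helpers above work without them installed.
    """
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base = AutoModelForCausalLM.from_pretrained(base_model_id, torch_dtype="auto")
    model = PeftModel.from_pretrained(base, adapter_id)
    return model, tokenizer
```

Calling `load_adapter(adapter_ids("risk-omission")[0])` would download the base weights and the adapter, so it needs network access and enough memory for an 8B-parameter model. Loading the control from the same pair with the same prompts gives the matched comparison.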