Five Qwen3-8B LoRAs exhibiting distinct oversight-gated misalignments, each paired with a matched control.
Boyd Kane
beyarkay
Models: 31
beyarkay/5x-task-laziness-control
Text Generation • Updated • 1
beyarkay/5x-task-laziness-mo
Text Generation • Updated • 2
beyarkay/5x-sycophancy-reasoning-control
Text Generation • Updated • 4
beyarkay/5x-sycophancy-reasoning-mo
Text Generation • Updated • 1
beyarkay/5x-shutdown-resistance-control
Text Generation • Updated • 3
beyarkay/5x-shutdown-resistance-mo
Text Generation • Updated • 3
beyarkay/5x-risk-omission-control
Text Generation • Updated • 2
beyarkay/5x-risk-omission-mo
Text Generation • Updated • 2
beyarkay/5x-immediate-gratification-control
Text Generation • Updated • 2
beyarkay/5x-immediate-gratification-mo
Text Generation • Updated • 3
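The ten repositories listed above follow a regular naming pattern: one `-mo` and one `-control` repository per behaviour. The sketch below rebuilds those pairs from the behaviour slugs so they can be iterated over together. The helper `repo_pair` and the constants are hypothetical names introduced for illustration. The reading of `-mo` as the misaligned variant is an assumption, since the listing does not spell the suffix out.

```python
# Behaviour slugs copied from the repository names in the listing above.
BEHAVIOURS = [
    "task-laziness",
    "sycophancy-reasoning",
    "shutdown-resistance",
    "risk-omission",
    "immediate-gratification",
]

OWNER = "beyarkay"  # account name from the profile
PREFIX = "5x"       # shared prefix on every listed repository


def repo_pair(behaviour: str) -> dict[str, str]:
    """Return the `-mo` and `-control` repo IDs for one behaviour slug.

    Hypothetical helper: it only builds strings and does not check
    that the repositories exist or what they contain.
    """
    base = f"{OWNER}/{PREFIX}-{behaviour}"
    return {"mo": f"{base}-mo", "control": f"{base}-control"}


# Map each behaviour to its pair of repository IDs.
PAIRS = {b: repo_pair(b) for b in BEHAVIOURS}

if __name__ == "__main__":
    for behaviour, pair in PAIRS.items():
        print(f"{behaviour}: {pair['mo']}  vs  {pair['control']}")
```

Keeping the two IDs of each pair side by side makes it harder to compare a `-mo` model against the wrong control by accident.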