(Some) Emergent Misalignment from Reward Hacking in RL (Collection): Model checkpoints from the project "(Some) Natural Emergent Misalignment from Reward Hacking in Non-Production RL" • 228 items • Updated 16 days ago
Open Character Training (Collection): https://arxiv.org/abs/2511.01689 • 8 items • Updated Nov 4, 2025