Run 8: disable unlikeliness shaping (β_rank=0.0) to fix the classification-task reward inversion that collapsed Run 7 accuracy to 46%
5c963ca (verified), chane335 committed on Apr 25