STAPO: Stabilizing Reinforcement Learning for LLMs by Silencing Rare Spurious Tokens Paper • 2602.15620 • Published 3 days ago • 3