Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning Paper • 2606.04923 • Published 1 day ago • 34
Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning Paper • 2606.04923 • Published 1 day ago • 34
Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning Paper • 2606.04923 • Published 1 day ago • 34
Echoes as Anchors: Probabilistic Costs and Attention Refocusing in LLM Reasoning Paper • 2602.06600 • Published Feb 6 • 3
Echoes as Anchors: Probabilistic Costs and Attention Refocusing in LLM Reasoning Paper • 2602.06600 • Published Feb 6 • 3
Echoes as Anchors: Probabilistic Costs and Attention Refocusing in LLM Reasoning Paper • 2602.06600 • Published Feb 6 • 3