Attention Amnesia in Hybrid LLMs: When CoT Fine-Tuning Breaks Long-Range Recall, and How to Fix It • Paper • 2606.11052 • Published 1 day ago • 4
Reasoning Arena: Trace Tournaments When Verifiable Rewards Fall Short • Paper • 2606.09380 • Published 2 days ago • 7
The Depth Ceiling: On the Limits of Large Language Models in Discovering Latent Planning • Paper • 2604.06427 • Published Apr 7 • 11
Believe Your Model: Distribution-Guided Confidence Calibration • Paper • 2603.03872 • Published Mar 4 • 40
Efficient RLVR Training via Weighted Mutual Information Data Selection • Paper • 2603.01907 • Published Mar 2 • 14