Unveiling Simplicities of Attention: Adaptive Long-Context Head Identification Paper • 2502.09647 • Published Feb 11, 2025
Provable Benefits of In-Tool Learning for Large Language Models Paper • 2508.20755 • Published Aug 28, 2025
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability Paper • 2601.18778 • Published Jan 26, 2026