Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs Paper • 2603.09906 • Published 3 days ago • 58
IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse Paper • 2603.12201 • Published 1 day ago • 30
Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published Feb 9 • 261
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger Paper • 2602.08222 • Published Feb 9 • 282
A-RAG: Scaling Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces Paper • 2602.03442 • Published Feb 3 • 19
PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published Jan 30 • 217
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning Paper • 2512.07461 • Published Dec 8, 2025 • 78
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published Dec 1, 2025 • 105
THINKSAFE: Self-Generated Safety Alignment for Reasoning Models Paper • 2601.23143 • Published Jan 30 • 39
Self-Improving Pretraining: using post-trained models to pretrain better models Paper • 2601.21343 • Published Jan 29 • 17
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting Paper • 2601.02151 • Published Jan 5 • 112
ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking Paper • 2601.06487 • Published Jan 10 • 53
Nested Learning: The Illusion of Deep Learning Architectures Paper • 2512.24695 • Published Dec 31, 2025 • 44
Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards Paper • 2601.06021 • Published Jan 9 • 47