AIR: Post-training Data Selection for Reasoning via Attention Head Influence Paper • 2512.13279 • Published Dec 15, 2025 • 2
LLMs are Also Effective Embedding Models: An In-depth Overview Paper • 2412.12591 • Published Dec 17, 2024 • 2
MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling Paper • 2606.13473 • Published 14 days ago • 90
GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment Paper • 2605.19577 • Published May 19 • 59
OProver: A Unified Framework for Agentic Formal Theorem Proving Paper • 2605.17283 • Published May 17 • 31
STALE: Can LLM Agents Know When Their Memories Are No Longer Valid? Paper • 2605.06527 • Published May 7 • 46
Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context Paper • 2605.13831 • Published May 13 • 88
Understanding by Reconstruction: Reversing the Software Development Process for LLM Pretraining Paper • 2603.11103 • Published Mar 11 • 9
OpenResearcher Collection OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis • 8 items • Updated Mar 24 • 18
The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning Paper • 2601.06002 • Published Jan 9 • 60
Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space Paper • 2512.24617 • Published Dec 31, 2025 • 66
NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents Paper • 2512.12730 • Published Dec 14, 2025 • 52
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published Nov 23, 2025 • 306
RLoop: An Self-Improving Framework for Reinforcement Learning with Iterative Policy Initialization Paper • 2511.04285 • Published Nov 6, 2025 • 8