LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards Paper • 2605.31584 • Published 29 days ago • 43
Not All Disagreement Is Learnable: Token Teachability in On-Policy Distillation Paper • 2605.26844 • Published May 26 • 26
E-PMQ: Expert-Guided Post-Merge Quantization with Merged-Weight Anchoring Paper • 2605.16882 • Published May 16 • 2
Geometry Conflict: Explaining and Controlling Forgetting in LLM Continual Post-Training Paper • 2605.09608 • Published May 10 • 53
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI Paper • 2512.16676 • Published Dec 18, 2025 • 224
Diffusion Language Models Know the Answer Before Decoding Paper • 2508.19982 • Published Aug 27, 2025 • 27
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published Nov 12, 2025 • 218
InfiR2: A Comprehensive FP8 Training Recipe for Reasoning-Enhanced Language Models Paper • 2509.22536 • Published Sep 26, 2025 • 2
DeepPrune: Parallel Scaling without Inter-trace Redundancy Paper • 2510.08483 • Published Oct 9, 2025 • 24
Towards a Unified View of Large Language Model Post-Training Paper • 2509.04419 • Published Sep 4, 2025 • 77
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth Paper • 2509.03867 • Published Sep 4, 2025 • 213
TwinMarket: A Scalable Behavioral and Social Simulation for Financial Markets Paper • 2502.01506 • Published Feb 3, 2025 • 39
InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection Paper • 2501.04575 • Published Jan 8, 2025 • 25
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published Dec 25, 2024 • 107
Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination Paper • 2411.03823 • Published Nov 6, 2024 • 50