Qwen-AgentWorld: Language World Models for General Agents • Paper • 2606.24597 • Published 8 days ago • 143
NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers? • Paper • 2606.24530 • Published 8 days ago • 62
EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions • Paper • 2606.23654 • Published 9 days ago • 79
Draft-OPD: On-Policy Distillation for Speculative Draft Models • Paper • 2605.29343 • Published May 28 • 36
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration • Paper • 2605.20025 • Published May 19 • 190
SkillsVote: Lifecycle Governance of Agent Skills from Collection, Recommendation to Evolution • Paper • 2605.18401 • Published May 18 • 130
Post-Trained MoE Can Skip Half Experts via Self-Distillation • Paper • 2605.18643 • Published May 18 • 30
Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling • Paper • 2605.13301 • Published May 13 • 165
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe • Paper • 2604.13016 • Published Apr 14 • 113
P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads • Paper • 2602.09443 • Published Feb 10 • 59
Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers • Paper • 2601.17367 • Published Jan 24 • 33
P1: Mastering Physics Olympiads with Reinforcement Learning • Paper • 2511.13612 • Published Nov 17, 2025 • 135
Cache-to-Cache: Direct Semantic Communication Between Large Language Models • Paper • 2510.03215 • Published Oct 3, 2025 • 99
From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones • Paper • 2509.25123 • Published Sep 29, 2025 • 22
FlowRL: Matching Reward Distributions for LLM Reasoning • Paper • 2509.15207 • Published Sep 18, 2025 • 119
Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing • Paper • 2509.08721 • Published Sep 10, 2025 • 665
A Survey of Reinforcement Learning for Large Reasoning Models • Paper • 2509.08827 • Published Sep 10, 2025 • 193
SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning • Paper • 2509.09674 • Published Sep 11, 2025 • 81