Toward Efficient Agents: Memory, Tool learning, and Planning Paper • 2601.14192 • Published 2 days ago • 35
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published 14 days ago • 203
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization Paper • 2601.05432 • Published 14 days ago • 160
TourPlanner: A Competitive Consensus Framework with Constraint-Gated Reinforcement Learning for Travel Planning Paper • 2601.04698 • Published 15 days ago • 10
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 253
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published Nov 23, 2025 • 294
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers Paper • 2508.21148 • Published Aug 28, 2025 • 140
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey Paper • 2509.02547 • Published Sep 2, 2025 • 229
A Survey of Reinforcement Learning for Large Reasoning Models Paper • 2509.08827 • Published Sep 10, 2025 • 190