DARC: Decoupled Asymmetric Reasoning Curriculum for LLM Evolution • Paper • 2601.13761 • Published 8 days ago • 15
LaSeR: Reinforcement Learning with Last-Token Self-Rewarding • Paper • 2510.14943 • Published Oct 16, 2025 • 40
GUICourse: From General Vision Language Models to Versatile GUI Agents • Paper • 2406.11317 • Published Jun 17, 2024 • 1
AgentCPM-GUI: Building Mobile-Use Agents with Reinforcement Fine-Tuning • Paper • 2506.01391 • Published Jun 2, 2025
Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective • Paper • 2506.17930 • Published Jun 22, 2025 • 18
ReDit: Reward Dithering for Improved LLM Policy Optimization • Paper • 2506.18631 • Published Jun 23, 2025 • 7