From Reasoning to Agentic: Credit Assignment in Reinforcement Learning for Large Language Models Paper • 2604.09459 • Published 3 days ago • 10
Mobile GUI Agent Privacy Personalization with Trajectory Induced Preference Optimization Paper • 2604.11259 • Published 3 days ago • 10
Agentic Aggregation for Parallel Scaling of Long-Horizon Agentic Tasks Paper • 2604.11753 • Published 3 days ago • 12
AgentSwing: Adaptive Parallel Context Management Routing for Long-Horizon Web Agents Paper • 2603.27490 • Published 18 days ago • 16
SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks Paper • 2604.08865 • Published 6 days ago • 24
KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance Paper • 2604.12627 • Published 2 days ago • 78
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published 2 days ago • 55
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation Paper • 2604.10098 • Published 5 days ago • 67
The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping Paper • 2604.11297 • Published 3 days ago • 89
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents Paper • 2604.11784 • Published 3 days ago • 118
Welcome GPT OSS, the new open-source model family from OpenAI! Article • Aug 5, 2025 • 513