TIDE: Proactive Multi-Problem Discovery via Template-Guided Iteration Paper • 2606.04743 • Published 24 days ago • 46
LaRA: Layer-wise Representation Analysis for Detecting Data Contamination in RL Post-Training Paper • 2605.29888 • Published 30 days ago • 34
OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources Paper • 2605.29250 • Published 30 days ago • 78
Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents Paper • 2605.28775 • Published about 1 month ago • 38
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning Paper • 2605.28774 • Published about 1 month ago • 93
It Takes Two: Complementary Self-Distillation for Contextual Integrity in LLMs Paper • 2605.20258 • Published May 18 • 30
Nudging Beyond the Comfort Zone: Efficient Strategy-Guided Exploration for RLVR Paper • 2605.15726 • Published May 15 • 34
On Training Large Language Models for Long-Horizon Tasks: An Empirical Study of Horizon Length Paper • 2605.02572 • Published May 4 • 2
Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Tasks Paper • 2604.20987 • Published Apr 22 • 22
SWE-chat: Coding Agent Interactions From Real Users in the Wild Paper • 2604.20779 • Published Apr 22 • 17
SkillLearnBench: Benchmarking Continual Learning Methods for Agent Skill Generation on Real-World Tasks Paper • 2604.20087 • Published Apr 22 • 15
Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents Paper • 2604.14004 • Published Apr 15 • 30
LEGO-Eval: Towards Fine-Grained Evaluation on Synthesizing 3D Embodied Environments with Tool Augmentation Paper • 2511.03001 • Published Nov 4, 2025 • 49
Safe and Scalable Web Agent Learning via Recreated Websites Paper • 2603.10505 • Published Mar 11 • 27