Zone of Proximal Policy Optimization: Teacher in Prompts, Not Gradients Paper • 2606.18216 • Published 10 days ago • 60
TIDE: Proactive Multi-Problem Discovery via Template-Guided Iteration Paper • 2606.04743 • Published 23 days ago • 46
K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts Paper • 2606.02404 • Published 25 days ago • 57
OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources Paper • 2605.29250 • Published 29 days ago • 78
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning Paper • 2605.28774 • Published 30 days ago • 93
Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents Paper • 2605.28775 • Published 30 days ago • 38
ResearchMath-14K: Scaling Research-Level Mathematics via Agents Paper • 2605.28003 • Published 30 days ago • 50
It Takes Two: Complementary Self-Distillation for Contextual Integrity in LLMs Paper • 2605.20258 • Published May 18 • 30
On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists Paper • 2605.20668 • Published May 20 • 12
Nudging Beyond the Comfort Zone: Efficient Strategy-Guided Exploration for RLVR Paper • 2605.15726 • Published May 15 • 34
Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents Paper • 2604.14004 • Published Apr 15 • 30
MA-EgoQA: Question Answering over Egocentric Videos from Multiple Embodied Agents Paper • 2603.09827 • Published Mar 10 • 30
MolHIT: Advancing Molecular-Graph Generation with Hierarchical Discrete Diffusion Models Paper • 2602.17602 • Published Feb 19 • 56
THINKSAFE: Self-Generated Safety Alignment for Reasoning Models Paper • 2601.23143 • Published Jan 30 • 39
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation Paper • 2601.00664 • Published Jan 2 • 58
WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning Paper • 2512.02425 • Published Dec 2, 2025 • 25
Adaptive Multi-Agent Response Refinement in Conversational Systems Paper • 2511.08319 • Published Nov 11, 2025 • 42
Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs Paper • 2510.09201 • Published Oct 10, 2025 • 50