DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Verifiable Constraints Paper • 2601.18137 • Published 5 days ago • 23
AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security Paper • 2601.18491 • Published 5 days ago • 118
SWE-Pruner: Self-Adaptive Context Pruning for Coding Agents Paper • 2601.16746 • Published 8 days ago • 85
DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation Paper • 2601.09688 • Published 16 days ago • 126
Unlocking Implicit Experience: Synthesizing Tool-Use Trajectories from Text Paper • 2601.10355 • Published 16 days ago • 39
Toward Efficient Agents: Memory, Tool learning, and Planning Paper • 2601.14192 • Published 10 days ago • 51
RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics Paper • 2512.13660 • Published Dec 15, 2025 • 37
When AI Agents Collude Online: Financial Fraud Risks by Collaborative LLM Agents on Social Platforms Paper • 2511.06448 • Published Nov 9, 2025 • 1
Geometrically-Constrained Agent for Spatial Reasoning Paper • 2511.22659 • Published Nov 27, 2025 • 41