Toward Generalist Autonomous Research via Hypothesis-Tree Refinement Paper • 2606.11926 • Published 22 days ago • 126
From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors Paper • 2605.31042 • Published May 29 • 19
PlanningBench: Generating Scalable and Verifiable Planning Data for Evaluating and Training Large Language Models Paper • 2605.20873 • Published May 20 • 44
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence Paper • 2604.18292 • Published Apr 20 • 88
MemSifter: Offloading LLM Memory Retrieval via Outcome-Driven Proxy Reasoning Paper • 2603.03379 • Published Mar 3 • 32
LaSER: Internalizing Explicit Reasoning into Latent Space for Dense Retrieval Paper • 2603.01425 • Published Mar 2 • 7
DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories Paper • 2602.10809 • Published Feb 11 • 59
LawThinker: A Deep Research Legal Agent in Dynamic Environments Paper • 2602.12056 • Published Feb 12 • 35