EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions Paper • 2606.23654 • Published 4 days ago • 74
ResearchClawBench: A Benchmark for End-to-End Autonomous Scientific Research Paper • 2606.07591 • Published 29 days ago • 95
Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs Paper • 2605.30611 • Published 29 days ago • 247
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration Paper • 2605.20025 • Published May 19 • 190
COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation Paper • 2605.31264 • Published 28 days ago • 118
Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering Paper • 2604.08224 • Published Apr 9 • 53
SkillOpt: Executive Strategy for Self-Evolving Agent Skills Paper • 2605.23904 • Published May 22 • 246
SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks Paper • 2602.12670 • Published Feb 13 • 62
MemGovern: Enhancing Code Agents through Learning from Governed Human Experiences Paper • 2601.06789 • Published Jan 11 • 82
view article Article Harness, Scaffold, and the AI Agent Terms Worth Getting Right sergiopaniego, ariG23498 • May 25 • 120
NanoResearch: Co-Evolving Skills, Memory, and Policy for Personalized Research Automation Paper • 2605.10813 • Published May 11 • 16
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 23 items • Updated 14 days ago • 330
DR^{3}-Eval: Towards Realistic and Reproducible Deep Research Evaluation Paper • 2604.14683 • Published Apr 16 • 36
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 509
FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction Paper • 2508.11987 • Published Aug 16, 2025 • 73
view article Article Beyond Semantic Similarity: Introducing NVIDIA NeMo Retriever’s Generalizable Agentic Retrieval Pipeline nvidia • Mar 13 • 40
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published Feb 5 • 356