SkillHarness: Harnessing Safe Skills for Computer-Use Agents Paper • 2606.20636 • Published 25 days ago • 19
MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory Paper • 2605.15128 • Published May 14 • 64
Graph2Eval: Automatic Multimodal Task Generation for Agents via Knowledge Graphs Paper • 2510.00507 • Published Oct 1, 2025 • 2
SafePred: A Predictive Guardrail for Computer-Using Agents via World Models Paper • 2602.01725 • Published Feb 2 • 1
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 269
HarmonyGuard: Toward Safety and Utility in Web Agents via Adaptive Policy Enhancement and Dual-Objective Optimization Paper • 2508.04010 • Published Aug 6, 2025 • 8