daVinci-Dev: Agent-native Mid-training for Software Engineering • Paper • 2601.18418 • Published 2 days ago • 113
AgentDoG • Collection • A Diagnostic Guardrail Framework for AI Agent Safety and Security • 11 items • Updated about 4 hours ago • 79
AgencyBench: Benchmarking the Frontiers of Autonomous Agents in 1M-Token Real-World Contexts • Paper • 2601.11044 • Published 12 days ago • 34
TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs • Paper • 2512.14698 • Published Dec 16, 2025 • 21
Do You Need Proprioceptive States in Visuomotor Policies? • Paper • 2509.18644 • Published Sep 23, 2025 • 50