EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents Paper • 2605.13841 • Published 14 days ago • 64
Do Enterprise Systems Need Learned World Models? The Importance of Context to Infer Dynamics Paper • 2605.12178 • Published 15 days ago • 60
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings Paper • 2603.13594 • Published Mar 13 • 149
Article Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance ServiceNow-AI • Dec 9, 2025 • 84
Grounding Computer Use Agents on Human Demonstrations Paper • 2511.07332 • Published Nov 10, 2025 • 107
Apriel-H1 Collection Introducing Apriel-H1 hybrids, each blending Attention and Mamba State Space layers in varying proportions. • 7 items • Updated Mar 2 • 10
AU-Harness: An Open-Source Toolkit for Holistic Evaluation of Audio LLMs Paper • 2509.08031 • Published Sep 9, 2025 • 21