JetSpec: Breaking the Scaling Ceiling of Speculative Decoding with Parallel Tree Drafting Paper • 2606.18394 • Published 2 days ago • 25
MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning Paper • 2605.14212 • Published May 14 • 18
AMA-Bench: Evaluating Long-Horizon Memory for Agentic Applications Paper • 2602.22769 • Published Feb 26 • 10
aiXiv: A Next-Generation Open Access Ecosystem for Scientific Discovery Generated by AI Scientists Paper • 2508.15126 • Published Aug 20, 2025 • 20
L-MARS: Legal Multi-Agent Workflow with Orchestrated Reasoning and Agentic Search Paper • 2509.00761 • Published Aug 31, 2025
ClawTrace: Cost-Aware Tracing for LLM Agent Skill Distillation Paper • 2604.23853 • Published Apr 26 • 2
EvoClaw: Evaluating AI Agents on Continuous Software Evolution Paper • 2603.13428 • Published Mar 13 • 21
Benchmarking Scientific Understanding and Reasoning for Video Generation using VideoScience-Bench Paper • 2512.02942 • Published Dec 2, 2025 • 5
Fast and Accurate Causal Parallel Decoding using Jacobi Forcing Paper • 2512.14681 • Published Dec 16, 2025 • 44
Stronger Together: On-Policy Reinforcement Learning for Collaborative LLMs Paper • 2510.11062 • Published Oct 13, 2025 • 29
lmgame-Bench: How Good are LLMs at Playing Games? Paper • 2505.15146 • Published May 21, 2025 • 20