Workspace-Bench 1.0: Benchmarking AI Agents on Workspace Tasks with Large-Scale File Dependencies Paper • 2605.03596 • Published 7 days ago • 7
Mean Mode Screaming: Mean–Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published 5 days ago • 75
Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence Paper • 2604.24954 • Published 15 days ago • 21
TEMPO: Scaling Test-time Training for Large Reasoning Models Paper • 2604.19295 • Published 21 days ago • 34
TAPS: Task Aware Proposal Distributions for Speculative Sampling Paper • 2603.27027 • Published Mar 27 • 144
CARLA-Air: Fly Drones Inside a CARLA World – A Unified Infrastructure for Air-Ground Embodied Intelligence Paper • 2603.28032 • Published Mar 30 • 341
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published Mar 20 • 350
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models Paper • 2603.25716 • Published Mar 26 • 156
Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published Feb 9 • 264
TermiGen: High-Fidelity Environment and Robust Trajectory Synthesis for Terminal Agents Paper • 2602.07274 • Published Feb 6 • 210
Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs Paper • 2602.10388 • Published Feb 11 • 244
The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies Paper • 2602.09877 • Published Feb 10 • 197
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger Paper • 2602.08222 • Published Feb 9 • 290