One Model, Many Latencies: Universal Speech Enhancement for Diverse Real-Time Applications Paper • 2606.25621 • Published 1 day ago • 11
Zone of Proximal Policy Optimization: Teacher in Prompts, Not Gradients Paper • 2606.18216 • Published 10 days ago • 60
Where Did It Go Wrong? Process-Level Evaluation of Web Agents with Semantic State Tracking Paper • 2606.15673 • Published Apr 8 • 13
SpatialClaw: Rethinking Action Interface for Agentic Spatial Reasoning Paper • 2606.13673 • Published 15 days ago • 106
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning Paper • 2605.28774 • Published 30 days ago • 93
WorldKV: Efficient World Memory with World Retrieval and Compression Paper • 2605.22718 • Published May 21 • 42
Rethinking State Tracking in Recurrent Models Through Error Control Dynamics Paper • 2605.07755 • Published May 8 • 24
Video Streaming Thinking: VideoLLMs Can Watch and Think Simultaneously Paper • 2603.12262 • Published Mar 12 • 31
TimeChat-Captioner: Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions Paper • 2602.08711 • Published Feb 9 • 29
HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing Paper • 2602.03560 • Published Feb 3 • 49
OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models Paper • 2602.04804 • Published Feb 4 • 50