Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining Paper • 2605.14747 • Published 8 days ago • 139 • 3
Mega-ASR: Towards In-the-wild^2 Speech Recognition via Scaling up Real-world Acoustic Simulation Paper • 2605.19833 • Published 3 days ago • 118 • 2
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 10 days ago • 135 • 4
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation Paper • 2605.18739 • Published 4 days ago • 104 • 2
MMSkills: Towards Multimodal Skills for General Visual Agents Paper • 2605.13527 • Published 8 days ago • 116 • 3
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 9 days ago • 261 • 3
EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents Paper • 2605.13841 • Published 9 days ago • 64 • 3
Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context Paper • 2605.13831 • Published 9 days ago • 85 • 2
AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation Paper • 2605.13724 • Published 9 days ago • 96 • 2
MulTaBench: Benchmarking Multimodal Tabular Learning with Text and Image Paper • 2605.10616 • Published 11 days ago • 138 • 3
MinT: Managed Infrastructure for Training and Serving Millions of LLMs Paper • 2605.13779 • Published 9 days ago • 216 • 4
World Action Models: The Next Frontier in Embodied AI Paper • 2605.12090 • Published 10 days ago • 64 • 2
RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards Paper • 2605.10899 • Published 11 days ago • 74 • 2
δ-mem: Efficient Online Memory for Large Language Models Paper • 2605.12357 • Published 10 days ago • 120 • 5
MemPrivacy: Privacy-Preserving Personalized Memory Management for Edge-Cloud Agents Paper • 2605.09530 • Published 12 days ago • 145 • 4
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture Paper • 2605.12500 • Published 10 days ago • 185 • 2
TMAS: Scaling Test-Time Compute via Multi-Agent Synergy Paper • 2605.10344 • Published 11 days ago • 49 • 2