Steady-Forcing: Balancing Spatial Persistence and Motion Continuity in Long-Horizon Nature Video Diffusion Paper • 2606.14732 • Published 23 days ago • 3
japanese-asr/whisper_transcriptions.reazon_speech_all Viewer • Updated Sep 14, 2024 • 17.3M • 139k • 14
LaRA: Layer-wise Representation Analysis for Detecting Data Contamination in RL Post-Training Paper • 2605.29888 • Published 28 days ago • 34
LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding Paper • 2605.27365 • Published about 1 month ago • 144
A Survey of Large Audio Language Models: Generalization, Trustworthiness, and Outlook Paper • 2605.20266 • Published May 18 • 56
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published May 12 • 196
Towards On-Policy Data Evolution for Visual-Native Multimodal Deep Search Agents Paper • 2605.10832 • Published May 11 • 22
Mean Mode Screaming: Mean--Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published May 7 • 237
Agentic AI Systems Should Be Designed as Marginal Token Allocators Paper • 2605.01214 • Published May 2 • 4
Contexts are Never Long Enough: Structured Reasoning for Scalable Question Answering over Long Document Sets Paper • 2604.22294 • Published Apr 24 • 18