Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR Paper • 2602.05261 • Published Feb 5 • 54
Slow Perception: Let's Perceive Geometric Figures Step-by-step Paper • 2412.20631 • Published Dec 30, 2024 • 16
NAVA Audio-Video Generator: Native AV alignment, joint video + audio generation Space • Running on Zero
PP-OCRv6: From 1.5M to 34.5M Parameters, Surpassing Billion-Scale VLMs on OCR Tasks Paper • 2606.13108 • Published 17 days ago • 8
Memento: Reconstruct to Remember for Consistent Long Video Generation Paper • 2606.14667 • Published 16 days ago • 17
DuMate-DeepResearch: An Auditable Multi-Agent System with Recursive Search and Rubric-Grounded Reasoning Paper • 2606.07299 • Published 23 days ago • 7
When Tools Fail: Benchmarking Dynamic Replanning and Anomaly Recovery in LLM Agents Paper • 2606.05806 • Published 24 days ago • 23
Sparse Growing Transformer: Training-Time Sparse Depth Allocation via Progressive Attention Looping Paper • 2603.23998 • Published Apr 16 • 1
Learning to Generate via Understanding: Understanding-Driven Intrinsic Rewarding for Unified Multimodal Models Paper • 2603.06043 • Published Mar 6
Blink: Dynamic Visual Token Resolution for Enhanced Multimodal Understanding Paper • 2512.10548 • Published May 23
V-ITI: Mitigating Hallucinations in Multimodal Large Language Models via Visual Inference-Time Intervention Paper • 2512.03542 • Published Dec 3, 2025