VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training Paper • 2602.10693 • Published 17 days ago • 188
DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation Paper • 2601.22153 • Published 30 days ago • 71
FrankenMotion: Part-level Human Motion Generation and Composition Paper • 2601.10909 • Published Jan 15 • 18
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published Jan 11 • 212
3AM: Segment Anything with Geometric Consistency in Videos Paper • 2601.08831 • Published Jan 13 • 34
Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking Paper • 2601.04720 • Published Jan 8 • 56
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 228
MedGRPO: Multi-Task Reinforcement Learning for Heterogeneous Medical Video Understanding Paper • 2512.06581 • Published Dec 6, 2025 • 2
SOP: A Scalable Online Post-Training System for Vision-Language-Action Models Paper • 2601.03044 • Published Jan 6 • 28
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation Paper • 2601.00664 • Published Jan 2 • 56