view article Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data +7 Jun 3, 2025 • 315
Next-Embedding Prediction Makes Strong Vision Learners Paper • 2512.16922 • Published Dec 18, 2025 • 84
Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models Paper • 2509.26628 • Published Sep 30, 2025 • 17
MotionRAG: Motion Retrieval-Augmented Image-to-Video Generation Paper • 2509.26391 • Published Sep 30, 2025 • 22
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos Paper • 2409.02095 • Published Sep 3, 2024 • 37
Video Editing via Factorized Diffusion Distillation Paper • 2403.09334 • Published Mar 14, 2024 • 22