Lost in the Noise: How Reasoning Models Fail with Contextual Distractors Paper • 2601.07226 • Published 16 days ago • 32
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion Paper • 2512.17504 • Published Dec 19, 2025 • 97
EgoX: Egocentric Video Generation from a Single Exocentric Video Paper • 2512.08269 • Published Dec 9, 2025 • 119
RT-DETRv4: Painlessly Furthering Real-Time Object Detection with Vision Foundation Models Paper • 2510.25257 • Published Oct 29, 2025 • 5
ReDirector: Creating Any-Length Video Retakes with Rotary Camera Encoding Paper • 2511.19827 • Published Nov 25, 2025 • 11
Depth Anything 3: Recovering the Visual Space from Any Views Paper • 2511.10647 • Published Nov 13, 2025 • 97
Exploring Conditions for Diffusion models in Robotic Control Paper • 2510.15510 • Published Oct 17, 2025 • 40
Video Action Recognition On MERL Shopping Dataset 📊 Video action recognition task with V-JEPA2 model