FORT-Searcher: Synthesizing Shortcut-Resistant Search Tasks for Training Deep Search Agents Paper • 2606.12087 • Published 16 days ago • 75
OmniGameArena: A Unified UE5 Benchmark for VLM Game Agents with Improvement Dynamics Paper • 2606.09826 • Published 18 days ago • 19
On-Policy Adversarial Flow Distillation for Autoregressive Video Generation Paper • 2605.26105 • Published May 25 • 19
VP-VLA: Visual Prompting as an Interface for Vision-Language-Action Models Paper • 2603.22003 • Published Mar 23 • 12
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders Paper • 2603.06569 • Published Mar 6 • 120
Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization Paper • 2601.12993 • Published Jan 19 • 77
RePlan: Reasoning-guided Region Planning for Complex Instruction-based Image Editing Paper • 2512.16864 • Published Dec 18, 2025 • 11
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper • 2512.08765 • Published Dec 9, 2025 • 134