ACE-Ego-0: Unifying Egocentric Human and Robotic Data for VLA Pretraining Paper • 2606.17200 • Published 12 days ago • 51
Hierarchical Advantage Weighting for Online RL Fine-Tuning of VLAs from Sparse Episode Outcomes Paper • 2606.17043 • Published 12 days ago • 10
GRADE: Benchmarking Discipline-Informed Reasoning in Image Editing Paper • 2603.12264 • Published Mar 12 • 15
InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing Paper • 2603.09877 • Published Mar 10 • 49
EditThinker: Unlocking Iterative Reasoning for Any Image Editor Paper • 2512.05965 • Published Dec 5, 2025 • 38
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published Oct 17, 2025 • 93
NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints Paper • 2510.08565 • Published Oct 9, 2025 • 22
MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization Paper • 2510.08540 • Published Oct 9, 2025 • 110