VLA-OS: Structuring and Dissecting Planning Representations and Paradigms in Vision-Language-Action Models Paper • 2506.17561 • Published Jun 21, 2025
Unlocking the Power of Multi-Agent LLM for Reasoning: From Lazy Agents to Deliberation Paper • 2511.02303 • Published Nov 4, 2025 • 1
VIDEOP2R: Video Understanding from Perception to Reasoning Paper • 2511.11113 • Published Nov 14, 2025 • 113
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions Paper • 2412.08737 • Published Dec 11, 2024 • 54