EmbodMocap: In-the-Wild 4D Human-Scene Reconstruction for Embodied Agents Paper • 2602.23205 • Published 2 days ago • 9
Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models Paper • 2602.02185 • Published 26 days ago • 125
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models Paper • 2601.22060 • Published 30 days ago • 157