MWM: Mobile World Models for Action-Conditioned Consistent Prediction Paper • 2603.07799 • Published 5 days ago
OCR-Agent: Agentic OCR with Capability and Memory Reflection Paper • 2602.21053 • Published 17 days ago • 2
StereoAdapter-2: Globally Structure-Consistent Underwater Stereo Depth Estimation Paper • 2602.16915 • Published 23 days ago
MoRL: Reinforced Reasoning for Unified Motion Understanding and Generation Paper • 2602.14534 • Published 26 days ago • 3
Light4D: Training-Free Extreme Viewpoint 4D Video Relighting Paper • 2602.11769 • Published 30 days ago • 2
Code2Worlds: Empowering Coding LLMs for 4D World Generation Paper • 2602.11757 • Published 30 days ago • 4
GeneralVLA: Generalizable Vision-Language-Action Models with Knowledge-Guided Trajectory Planning Paper • 2602.04315 • Published Feb 4 • 1
V-Retrver: Evidence-Driven Agentic Reasoning for Universal Multimodal Retrieval Paper • 2602.06034 • Published Feb 5 • 8
SafeMo: Linguistically Grounded Unlearning for Trustworthy Text-to-Motion Generation Paper • 2601.00590 • Published Jan 2
MMCLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training Paper • 2407.19546 • Published Jul 28, 2024
ManipLVM-R1: Reinforcement Learning for Reasoning in Embodied Manipulation with Large Vision-Language Models Paper • 2505.16517 • Published May 22, 2025