MWM: Mobile World Models for Action-Conditioned Consistent Prediction Paper • 2603.07799 • Published 4 days ago
OCR-Agent: Agentic OCR with Capability and Memory Reflection Paper • 2602.21053 • Published 16 days ago • 2
StereoAdapter-2: Globally Structure-Consistent Underwater Stereo Depth Estimation Paper • 2602.16915 • Published 22 days ago
MoRL: Reinforced Reasoning for Unified Motion Understanding and Generation Paper • 2602.14534 • Published 25 days ago • 3
Light4D: Training-Free Extreme Viewpoint 4D Video Relighting Paper • 2602.11769 • Published 29 days ago • 2
Code2Worlds: Empowering Coding LLMs for 4D World Generation Paper • 2602.11757 • Published 29 days ago • 4
GeneralVLA: Generalizable Vision-Language-Action Models with Knowledge-Guided Trajectory Planning Paper • 2602.04315 • Published Feb 4 • 1
V-Retrver: Evidence-Driven Agentic Reasoning for Universal Multimodal Retrieval Paper • 2602.06034 • Published Feb 5 • 8
SafeMo: Linguistically Grounded Unlearning for Trustworthy Text-to-Motion Generation Paper • 2601.00590 • Published Jan 2
MMCLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training Paper • 2407.19546 • Published Jul 28, 2024