Beyond NL2Code: A Structured Survey of Multimodal Code Intelligence Paper • 2606.15932 • Published 10 days ago • 27
Look Light, Think Heavy: What Multimodal Chain-of-Thought Reasoning Can and Cannot Do Paper • 2606.22565 • Published 5 days ago • 7
PerceptionDLM: Parallel Region Perception with Multimodal Diffusion Language Models Paper • 2606.19534 • Published 9 days ago • 61
VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding Paper • 2606.05259 • Published 23 days ago • 39
Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation Paper • 2606.06428 • Published 22 days ago • 25
Does Seeing More Mean Knowing More? Mono-Anchored Advantage Normalization for Multi-Source Visual Reasoning Paper • 2605.25437 • Published May 25 • 17
Cheers: Decoupling Patch Details from Semantic Representations Enables Unified Multimodal Comprehension and Generation Paper • 2603.12793 • Published Mar 13 • 38
Thinking with Drafting: Optical Decompression via Logical Reconstruction Paper • 2602.11731 • Published Feb 12 • 36 • 4
Imagination Helps Visual Reasoning, But Not Yet in Latent Space Paper • 2602.22766 • Published Feb 26 • 44
Imagination Helps Visual Reasoning, But Not Yet in Latent Space Paper • 2602.22766 • Published Feb 26 • 44
Paper2Rebuttal: A Multi-Agent Framework for Transparent Author Response Assistance Paper • 2601.14171 • Published Jan 20 • 53