LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published 15 days ago • 239
Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models Paper • 2602.07026 • Published Feb 2 • 140
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published Jan 11 • 214
Lumina-DiMOO Family Collection Open-Sourced Large Diffusion Language Model for Multi-Modal Generation and Understanding • 3 items • Updated Mar 2 • 5
dMLLM-TTS: Self-Verified and Efficient Test-Time Scaling for Diffusion Multi-Modal Large Language Models Paper • 2512.19433 • Published Dec 22, 2025 • 3 • 1
From Masks to Worlds: A Hitchhiker's Guide to World Models Paper • 2510.20668 • Published Oct 23, 2025 • 8
UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation Paper • 2510.18701 • Published Oct 21, 2025 • 68