Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality Paper • 2410.04780 • Published Oct 7, 2024 • 1
VisionFoundry: Teaching VLMs Visual Perception with Synthetic Images Paper • 2604.09531 • Published about 1 month ago • 8 • 2
Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models Paper • 2602.07026 • Published Feb 2 • 140
Omni Image Editor 🖼 Space • Running on CPU Upgrade • Agents • 1.65k • Image edit, text to image, image upscale, remove watermark
Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward Paper • 2511.20561 • Published Nov 25, 2025 • 33
UltraFlux: Data-Model Co-Design for High-quality Native 4K Text-to-Image Generation across Diverse Aspect Ratios Paper • 2511.18050 • Published Nov 22, 2025 • 38