ViFeEdit: A Video-Free Tuner of Your Video Diffusion Transformer Paper • 2603.15478 • Published 1 day ago • 21
Anatomy of a Lie: A Multi-Stage Diagnostic Framework for Tracing Hallucinations in Vision-Language Models Paper • 2603.15557 • Published 1 day ago • 25
Anatomy of a Lie: A Multi-Stage Diagnostic Framework for Tracing Hallucinations in Vision-Language Models Paper • 2603.15557 • Published 1 day ago • 25 • 3
Anatomy of a Lie: A Multi-Stage Diagnostic Framework for Tracing Hallucinations in Vision-Language Models Paper • 2603.15557 • Published 1 day ago • 25
Vid-SME: Membership Inference Attacks against Large Video Understanding Models Paper • 2506.03179 • Published May 29, 2025
Anatomy of a Lie: A Multi-Stage Diagnostic Framework for Tracing Hallucinations in Vision-Language Models Paper • 2603.15557 • Published 1 day ago • 25
In-Video Instructions: Visual Signals as Generative Control Paper • 2511.19401 • Published Nov 24, 2025 • 32
UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist Paper • 2511.08521 • Published Nov 11, 2025 • 39
Discrete Diffusion LLM & MLLM Collection An collection of research/models in discrete diffusion large language and multimodal models • 57 items • Updated Jun 17, 2025 • 4
Discrete Diffusion in Large Language and Multimodal Models: A Survey Paper • 2506.13759 • Published Jun 16, 2025 • 43