MultiRef: Controllable Image Generation with Multiple Visual References Paper • 2508.06905 • Published Aug 9, 2025
Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment Paper • 2411.17188 • Published Nov 26, 2024
Thinking Before Looking: Improving Multimodal LLM Reasoning via Mitigating Visual Hallucination Paper • 2411.12591 • Published Nov 15, 2024
CrowdSelect: Synthetic Instruction Data Selection with Multi-LLM Wisdom Paper • 2503.01836 • Published Mar 3, 2025
CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale Paper • 2502.16645 • Published Feb 23, 2025
MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark Paper • 2402.04788 • Published Feb 7, 2024