See and Fix the Flaws: Enabling VLMs and Diffusion Models to Comprehend Visual Artifacts via Agentic Data Synthesis
Paper • 2602.20951 • Published 2 days ago • 12
Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage
Paper • 2412.15484 • Published Dec 20, 2024 • 15