- Woodpecker: Hallucination Correction for Multimodal Large Language Models
  Paper • 2310.16045 • Published • 17
- HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
  Paper • 2310.14566 • Published • 27
- SILC: Improving Vision Language Pretraining with Self-Distillation
  Paper • 2310.13355 • Published • 9
- Conditional Diffusion Distillation
  Paper • 2310.01407 • Published • 20