microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated Dec 10, 2025 • 283k • 1.58k
ibm-granite/granite-vision-3.2-2b Image-Text-to-Text • 3B • Updated Jun 12, 2025 • 2.79k • 123
nomic-ai/colnomic-embed-multimodal-7b Visual Document Retrieval • Updated Apr 15, 2025 • 6.33k • 102
Vidore Leaderboard 🥇 Space • Running • 203 • Explore and compare visual document retrieval benchmark results
nomic-ai/nomic-embed-multimodal-3b Visual Document Retrieval • Updated Apr 15, 2025 • 3.22k • 29
moondream/moondream-2b-2025-04-14-4bit Image-Text-to-Text • 1B • Updated May 22, 2025 • 9.3k • 66
Alibaba-NLP/gme-Qwen2-VL-2B-Instruct Sentence Similarity • 2B • Updated Jun 9, 2025 • 20.3k • 133