OpenGVLab/VideoChat-Flash-Qwen2_5-2B_res448 Video-Text-to-Text • 2B • Updated Mar 16, 2025 • 397 • 27
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 512k • 1.61k
docling-project/SmolDocling-256M-preview Image-Text-to-Text • 0.3B • Updated Sep 17, 2025 • 26.4k • 1.61k