OpenGVLab/VideoChat-Flash-Qwen2_5-2B_res448 • Video-Text-to-Text • 2B • Updated Mar 16, 2025 • 789 • 26
microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 193k • 1.56k
docling-project/SmolDocling-256M-preview • Image-Text-to-Text • 0.3B • Updated Sep 17, 2025 • 22.6k • 1.61k