microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 439k • 1.6k
Chat With Janus-Pro-7B 🌍 • Runtime error • Featured • 2.02k • A unified multimodal understanding and generation model.
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B Text Generation • 2B • Updated Feb 24, 2025 • 582k • 1.51k
Open VLM Leaderboard 🌎 • Running on CPU Upgrade • 1.01k • VLMEvalKit Evaluation Results Collection