microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 541k • 1.61k
meta-llama/Llama-3.2-90B-Vision-Instruct Image-Text-to-Text • 89B • Updated Mar 4, 2025 • 12.5k • 358