meta-llama/Llama-3.2-90B-Vision-Instruct Image-Text-to-Text • 89B • Updated Mar 4, 2025 • 34.7k • 348
meta-llama/Llama-3.2-11B-Vision-Instruct Image-Text-to-Text • 11B • Updated Dec 4, 2024 • 114k • 1.55k
Chat With Janus-Pro-7B 🌍 • Running on Zero • Featured • 2.01k • A unified multimodal understanding and generation model.
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • 6B • Updated 22 days ago • 257k • 1.55k