Qwen3 VL Video Grounding 🥠 • Running on Zero • Agents • Featured • 17 • Text-guided object tracking, point tracking, reasoning.
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 439k • 1.6k