microsoft/Phi-4-multimodal-instruct
Automatic Speech Recognition • 6B • Updated • 224k • 1.56k
Magma-8B model for UI Agents
Chat with images, videos, or PDFs to generate text
OmniGen2: Unified Image Understanding and Generation
THUDM/GLM-4.1V-9B-Thinking Demo