oumo-os's Collections
Multimodal input, text gen
Updated Jan 29, 2025
Qwen/Qwen2.5-VL-72B-Instruct • Image-Text-to-Text • 73B params • Updated Jun 6, 2025 • 269k downloads • 598 likes
Qwen/Qwen2.5-VL-7B-Instruct • Image-Text-to-Text • 8B params • Updated Apr 6, 2025 • 5.02M downloads • 1.47k likes