simaai's Collections
Vision-Language Models
Updated 3 days ago
Precompiled vision-language models for on-device multimodal tasks.
simaai/gemma3-siglip448-a16w4 • Image-Text-to-Text • Updated Feb 5 • 114
simaai/Qwen3-VL-4B-Instruct-a16w4 • Image-Text-to-Text • Updated Jan 12
simaai/LFM2-VL-1.6B-a16w4 • Image-Text-to-Text • Updated Jan 12
simaai/LFM2-VL-3B-a16w4 • Image-Text-to-Text • Updated Jan 12
simaai/LFM2-VL-450M-a16w4 • Image-Text-to-Text • Updated Jan 12
simaai/Qwen3-VL-8B-Instruct-a16w4 • Image-Text-to-Text • Updated Jan 12
simaai/Qwen2.5-VL-3B-Instruct-a16w4 • Image-Text-to-Text • Updated Jan 26
simaai/llava-1.5-7b-hf-a16w4 • Image-Text-to-Text • Updated Dec 19, 2025 • 19
simaai/paligemma-3b-pt-224-a16w8 • Image-Text-to-Text • Updated Aug 11, 2025