Giymo11's Collections
Multimodal (Audio)
Updated Jan 8
Qwen/Qwen3-Omni-30B-A3B-Instruct • Any-to-Any • Updated Sep 22, 2025 • 517k • 866
Qwen/Qwen2-Audio-7B • Audio-Text-to-Text • Updated Nov 20, 2024 • 11k • 165
mistralai/Voxtral-Small-24B-2507 • Audio-Text-to-Text • Updated Dec 20, 2025 • 50.4k • 465
mistralai/Voxtral-Mini-3B-2507 • Updated Jul 28, 2025 • 488k • 628
moonshotai/Kimi-Audio-7B-Instruct • Text-to-Speech • Updated May 29, 2025 • 11.3k • 388
google/gemma-3n-E4B-it • Image-Text-to-Text • Updated Jul 14, 2025 • 72.7k • 882
nvidia/audio-flamingo-3-hf • Audio-Text-to-Text • Updated Jan 27 • 179k • 174