bezzam's Collections
Multimodel audio
Updated Dec 8, 2025
facebook/seamless-m4t-v2-large • Automatic Speech Recognition • 2B • Updated Jan 4, 2024 • 76.2k • 983
stepfun-ai/Step-Audio-2-mini • Any-to-Any • 8B • Updated Feb 14 • 1.8k • 255
bosonai/higgs-audio-v2-generation-3B-base • Text-to-Speech • 6B • Updated Apr 4 • 92.8k • 673