baojian1024's Collections
Audio

updated 1 day ago

  • microsoft/VibeVoice-ASR

    Automatic Speech Recognition • 9B params • Updated Jan 27 • 448k downloads • 1.16k likes

  • CohereLabs/cohere-transcribe-03-2026

    Automatic Speech Recognition • 2B params • Updated 3 days ago • 309k downloads • 968 likes

  • JiongzeYu/SparkVSR

    Updated Apr 4 • 719 downloads • 58 likes

  • smthem/SparkVSR-GGUF

    6B params • Updated Mar 25 • 65 downloads • 4 likes

  • microsoft/VibeVoice-1.5B

    Text-to-Speech • 3B params • Updated Jan 22 • 61.2k downloads • 2.39k likes

  • microsoft/VibeVoice-Realtime-0.5B

    Text-to-Speech • 1B params • Updated Dec 12, 2025 • 829k downloads • 1.23k likes

  • meituan-longcat/LongCat-AudioDiT-3.5B

    4B params • Updated Apr 3 • 628 downloads • 73 likes

  • openbmb/VoxCPM2

    Text-to-Speech • 2B params • Updated Apr 16 • 234k downloads • 1.36k likes

  • k2-fsa/OmniVoice

    Text-to-Speech • 0.6B params • Updated 25 days ago • 2.57M downloads • 962 likes

  • YJX-Xiaomi/ControlFoley

    Text-to-Audio • Updated 19 days ago • 69 downloads • 12 likes

  • Xanthius/Ace-Step-1.5-XL-Concept-Sliders

    Updated 22 days ago • 15

  • Supertone/supertonic-3

    Text-to-Speech • Updated 14 days ago • 57.6k downloads • 768 likes