CIMAI's Collections

Speech-to-Text Models

updated 10 days ago

https://huggingface.co/spaces/hf-audio/open_asr_leaderboard

  • mistralai/Voxtral-Mini-4B-Realtime-2602

    Automatic Speech Recognition • 4B • Updated 18 days ago • 802k • 757

  • mistralai/Voxtral-Mini-3B-2507

    Updated Jul 28, 2025 • 477k • 632

    Note: See benchmark scores at https://mistral.ai/news/voxtral


  • nvidia/canary-1b-flash

    Automatic Speech Recognition • Updated Dec 3, 2025 • 81.4k • 268

    Note: CC BY 4.0 license: 1) credit the creator, 2) add a link to the license, 3) indicate whether you made changes


  • nvidia/canary-qwen-2.5b

    Automatic Speech Recognition • Updated Dec 15, 2025 • 146k • 403

    Note: CC BY 4.0 license: 1) credit the creator, 2) add a link to the license, 3) indicate whether you made changes


  • nvidia/canary-180m-flash

    Automatic Speech Recognition • Updated Mar 18, 2025 • 1.3k • 97

    Note: CC BY 4.0 license: 1) credit the creator, 2) add a link to the license, 3) indicate whether you made changes
