CIMAI's Collections

Speech-to-Text Models

updated Jan 8

https://huggingface.co/spaces/hf-audio/open_asr_leaderboard


  • mistralai/Voxtral-Mini-3B-2507

    5B • Updated Jul 28, 2025 • 453k • 620

    Note: See benchmark scores here: https://mistral.ai/news/voxtral


  • nvidia/canary-1b-flash

    Automatic Speech Recognition • 0.8B • Updated Dec 3, 2025 • 1.5k • 264

    Note: CC BY 4.0 license: 1) credit the creator, 2) add a link to the license, 3) indicate whether you made changes


  • nvidia/canary-180m-flash

    Automatic Speech Recognition • Updated Mar 18, 2025 • 1.43k • 91

    Note: CC BY 4.0 license: 1) credit the creator, 2) add a link to the license, 3) indicate whether you made changes


  • nvidia/canary-qwen-2.5b

    Automatic Speech Recognition • 3B • Updated Dec 15, 2025 • 143k • 370

    Note: CC BY 4.0 license: 1) credit the creator, 2) add a link to the license, 3) indicate whether you made changes
