Cohere Transcribe 03-2026 Β· OpenASR

Dedicated 2B ASR for 14-language transcription

License Format Runtime Base model

Native speech-to-text in the OpenASR runtime β€” engineered for peak performance on CPU & GPU, no Python at inference time.


✨ Highlights

  • πŸŽ™οΈ Dedicated ASR β€” audio-in, text-out model built specifically for transcription
  • 🌍 14 languages β€” covers English, major European languages, Arabic, Chinese, Japanese, Korean, and Vietnamese
  • 🧱 Conformer encoder-decoder β€” large acoustic encoder with a lightweight Transformer decoder
  • πŸ¦€ Native in OpenASR β€” .oasr packs run with no Python at inference, engineered for peak performance on CPU & GPU

πŸš€ Quickstart

# 1. Install the OpenASR CLI  Β·  https://openasr.org
# 2. Pull a build (pick a quant β€” see the table below)
openasr pull cohere-transcribe-03-2026:q8

# 3. Transcribe
openasr transcribe audio.wav --model cohere-transcribe-03-2026

All builds for this model:

openasr pull cohere-transcribe-03-2026:fp16
openasr pull cohere-transcribe-03-2026:q8
openasr pull cohere-transcribe-03-2026:q4

πŸ“¦ Available builds

Quant File (.oasr) Size RAM peak RTF Β· M1 CPU RTF Β· M1 GPU JFK Ξ”WER vs fp16
fp16 cohere-transcribe-03-2026-fp16.oasr 4.14 GB 5.12 GB 0.26Γ— 0.11Γ— 0.0%
q8_0 cohere-transcribe-03-2026-q8_0.oasr 2.42 GB 3.42 GB 0.32Γ— 0.11Γ— 0.0%
q4_k cohere-transcribe-03-2026-q4_k.oasr 1.51 GB 2.49 GB 0.25Γ— 0.10Γ— 0.0%

RTF = real-time factor on the fixed 11s JFK clip (lower is faster); RAM peak measured per pack in an isolated subprocess. JFK Ξ”WER compares each quantized build's JFK transcript to this model's fp16 JFK transcript, so it measures quantization drift rather than absolute recognition accuracy. q8_0 is the recommended default β€” near-reference quality at a fraction of the footprint.

🧠 About Cohere Transcribe 03-2026

Cohere Transcribe 03-2026 is Cohere and Cohere Labs' open release of a 2B-parameter automatic speech recognition model. It is a dedicated audio-in, text-out architecture with a Conformer-based acoustic encoder and a lightweight Transformer decoder, trained from scratch for transcription. The upstream model card lists support for 14 languages across English, European, APAC, and MENA coverage and reports Apache-2.0 licensing. This OpenASR repo repackages the original CohereLabs/cohere-transcribe-03-2026 weights as .oasr packs that run natively in the OpenASR runtime with no Python at inference time. For most users the q8_0 build is the recommended default; q4_k is for tighter memory budgets and fp16 is for verification or maximum fidelity.

βš™οΈ How these packs were made

Converted from CohereLabs/cohere-transcribe-03-2026 with the OpenASR importer:

openasr model-pack import-cohere-local <src> <out>.oasr \
  --package-id cohere-transcribe-03-2026 --quantization {fp16,q8-0,q4-k}

The .oasr container is GGUF-backed; packs use zero-copy mmap weight binding and graph buffer reuse to keep peak memory low.

βš–οΈ License

These packs inherit the upstream model's license: Apache-2.0 (source). OpenASR packaging retains the upstream copyright and NOTICE; the only modifications are format conversion and quantization.

πŸ™ Acknowledgements

This pack is a redistribution of Cohere Transcribe 03-2026, released by Cohere and Cohere Labs (CohereLabs/cohere-transcribe-03-2026). All credit for the original model, training recipe, and weights belongs to Cohere and Cohere Labs. The upstream Hugging Face model card declares Apache-2.0 licensing; OpenASR only converts the weights into .oasr packages and adds quantized builds for local runtime use.

πŸ”— Links

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for OpenASR/cohere-transcribe-03-2026

Finetuned
(11)
this model