Parakeet TDT 0.6B v3 - MLX FP16

This is the NVIDIA Parakeet TDT 0.6B v3 model converted to MLX format at FP16 precision for inference on Apple Silicon.

Parakeet TDT is an automatic speech recognition (ASR) model based on the Token-and-Duration Transducer (TDT) architecture, trained on large-scale speech data. The v3 variant extends the English-only v2 to 25 European languages, English included.

Model Details

| Property | Value |
|---|---|
| Base Model | nvidia/parakeet-tdt-0.6b-v3 |
| Parameters | ~600M |
| Format | MLX SafeTensors (FP16) |
| Model Size | 1,196.08 MB |
| Sample Rate | 16,000 Hz |
| Architecture | FastConformer + TDT |
| Encoder Hidden | 1024 |
| Predictor Hidden | 640 |
| Joint Hidden | 640 |
| TDT Durations | [0, 1, 2, 3, 4] |
| Tokenizer | BPE |
| Languages | 25 European languages (BG, CS, DA, DE, EL, EN, ES, ET, FI, FR, HR, HU, IT, LT, LV, MT, NL, PL, PT, RO, RU, SK, SL, SV, UK) |
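
The TDT Durations entry is easier to read with a small example. In a Token-and-Duration Transducer, each decoding step predicts a token and an index into the duration list, and the duration says how many encoder frames to skip before the next step. The following is a minimal sketch of that greedy walk, not the decoder shipped with this conversion: `step_fn` is a hypothetical stand-in for the predictor and joint networks, and `blank_id` and `max_symbols_per_frame` are illustrative assumptions.

```python
from typing import Callable, List, Tuple

TDT_DURATIONS = [0, 1, 2, 3, 4]  # taken from the Model Details table


def greedy_tdt_walk(
    num_frames: int,
    step_fn: Callable[[int, int], Tuple[int, int]],  # hypothetical: (frame, last_token) -> (token, duration_index)
    blank_id: int,                                   # assumption: id of the blank symbol
    max_symbols_per_frame: int = 10,                 # assumption: safety cap per frame
) -> List[Tuple[int, int]]:
    """Return (token, frame) pairs emitted while walking the encoder frames."""
    t = 0
    last_token = blank_id
    emitted: List[Tuple[int, int]] = []
    symbols_here = 0
    while t < num_frames:
        token, duration_index = step_fn(t, last_token)
        skip = TDT_DURATIONS[duration_index]
        if token != blank_id:
            emitted.append((token, t))
            last_token = token
        if skip == 0:
            # Duration 0 keeps decoding on the same frame. Force progress on a
            # blank, or after too many symbols, so the loop always terminates.
            symbols_here += 1
            if token == blank_id or symbols_here >= max_symbols_per_frame:
                skip = 1
        if skip > 0:
            symbols_here = 0
        t += skip
    return emitted


# Toy predictions keyed by (frame, last_token); the values are made up.
toy = {(0, 0): (5, 2), (2, 5): (0, 4), (6, 5): (7, 0), (6, 7): (9, 3), (9, 9): (0, 0)}
result = greedy_tdt_walk(10, lambda t, last: toy[(t, last)], blank_id=0)
print(result)  # [(5, 0), (7, 6), (9, 6)]
```

In the toy run, ten frames are covered in five steps because most predictions jump several frames at once, which is the source of TDT's decoding speed advantage over frame-by-frame transducers.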

Intended Use

This model is optimized for on-device automatic speech recognition (ASR) on Apple Silicon devices (Mac, iPhone, iPad). It supports 25 European languages: BG, CS, DA, DE, EL, EN, ES, ET, FI, FR, HR, HU, IT, LT, LV, MT, NL, PL, PT, RO, RU, SK, SL, SV, UK.
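
Since the model expects 16,000 Hz audio (see Model Details), it can help to inspect a recording before passing it to whatever inference code is used. The sketch below uses only Python's standard `wave` module and reports whether a WAV file would need resampling. The mono check is an assumption about typical ASR front ends, not something stated in this card, and `describe_wav` is a hypothetical helper name.

```python
import wave

TARGET_SAMPLE_RATE = 16_000  # from the Model Details table


def describe_wav(path: str) -> dict:
    """Report basic properties of a WAV file relative to the 16 kHz target."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.getnframes()
    return {
        "sample_rate": rate,
        "channels": channels,
        "seconds": frames / rate,
        "needs_resampling": rate != TARGET_SAMPLE_RATE,
        "needs_downmix": channels != 1,  # assumption: mono input is expected
    }
```

The function only inspects the file. Resampling or downmixing is left to the audio library of choice.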

Files

  • config.json - Model configuration
  • model.safetensors - Model weights in SafeTensors format (FP16)
  • tokenizer.model - SentencePiece tokenizer model
  • tokenizer.vocab - Tokenizer vocabulary
  • vocab.txt - Vocabulary text file
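
After downloading, the file list above can be checked against a local directory, and the parameter count and precision can be read directly from the SafeTensors header without loading any weights. This is a minimal sketch using only the standard library. It relies on the SafeTensors layout (an 8-byte little-endian header length followed by a JSON header). The helper names `missing_files` and `summarize_safetensors` are hypothetical.

```python
import json
import struct
from math import prod
from pathlib import Path

EXPECTED_FILES = [
    "config.json",
    "model.safetensors",
    "tokenizer.model",
    "tokenizer.vocab",
    "vocab.txt",
]


def missing_files(model_dir: str) -> list:
    """Return the expected files that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def summarize_safetensors(path: str) -> tuple:
    """Return (total parameter count, sorted dtype names) from the file header."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # u64, little-endian
        header = json.loads(f.read(header_len))
    header.pop("__metadata__", None)  # optional free-form metadata entry
    n_params = sum(prod(t["shape"]) for t in header.values())
    dtypes = sorted({t["dtype"] for t in header.values()})
    return n_params, dtypes
```

For this model, one would expect a count near 600M and a dtype of `F16`, consistent with the table in Model Details: roughly 600M parameters at 2 bytes each is about 1.2 GB.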

Original Model

The weights were converted from the NVIDIA base model listed under Model Details.
