metadata
language:
- en
library_name: mlx
tags:
- asr
- speech-recognition
- apple-silicon
- mlx
- parakeet
- tdt
- sonic-speech
license: cc-by-4.0
base_model: nvidia/parakeet-tdt-0.6b-v2
pipeline_tag: automatic-speech-recognition
Parakeet TDT 0.6B V2 (MLX, BF16)
NVIDIA Parakeet-TDT 0.6B V2 converted to MLX SafeTensors format for inference on Apple Silicon. This is the reference BF16 checkpoint. For lower memory use, see the quantized variants:
- sonic-speech/parakeet-tdt-0.6b-v2-int8: Encoder INT8 / Decoder BF16 (recommended)
Performance (M3 Max, 64 GB)
| Metric | Value |
|---|---|
| WER (LibriSpeech test-clean) | 1.67% |
| RTFx | 73x realtime |
| Peak memory | ~3 GB |
| Parameters | 627M |
| Format | BF16 SafeTensors |
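To put the table in practical terms, the sketch below turns two of its figures into estimates. It assumes RTFx means seconds of audio processed per second of wall-clock time (the usual inverse real-time factor) and that every parameter is stored as 2-byte BF16. The helper names are illustrative and not part of any library.

```python
# Figures copied from the performance table above.
RTFX = 73            # assumed meaning: audio seconds processed per wall-clock second
N_PARAMS = 627e6     # parameter count
BYTES_PER_BF16 = 2   # bfloat16 is a 16-bit format


def transcription_seconds(audio_seconds: float, rtfx: float = RTFX) -> float:
    """Estimated wall-clock time to transcribe audio at a given RTFx."""
    return audio_seconds / rtfx


def bf16_weight_gb(n_params: float = N_PARAMS) -> float:
    """Size of the weights alone in decimal gigabytes (no activations)."""
    return n_params * BYTES_PER_BF16 / 1e9


if __name__ == "__main__":
    print(f"1 hour of audio: ~{transcription_seconds(3600):.0f} s")
    print(f"BF16 weights: ~{bf16_weight_gb():.2f} GB")
```

Under these assumptions, an hour of audio takes roughly 49 seconds, and the weights account for about 1.25 GB of the ~3 GB peak. The remainder is presumably activations and runtime buffers, though the table does not break it down.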
Usage
from parakeet import from_pretrained
model = from_pretrained("sonic-speech/parakeet-tdt-0.6b-v2")
result = model.transcribe("audio.wav")
print(result.text)
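To check throughput on your own machine, one option is to time the transcribe call against the audio duration. The sketch below is a minimal example using only the Python standard library. `measure_rtfx` and `wav_duration_seconds` are hypothetical helpers, not part of the package, and the duration reader handles only PCM WAV files.

```python
import time
import wave
from typing import Callable


def wav_duration_seconds(path: str) -> float:
    """Duration of a PCM WAV file, computed from its header."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def measure_rtfx(path: str, transcribe: Callable[[str], object]) -> float:
    """Return audio duration divided by the wall-clock time of transcribe(path)."""
    start = time.perf_counter()
    transcribe(path)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading from the timer on very fast calls.
    return wav_duration_seconds(path) / max(elapsed, 1e-9)
```

With the model loaded as above, `measure_rtfx("audio.wav", model.transcribe)` gives a single-run estimate. The first call may include one-time warm-up cost, so timing a second run is likely to be more representative.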
Origin
Weights converted from nvidia/parakeet-tdt-0.6b-v2 via the mlx-community conversion pipeline.
Hosted by Sonic Speech for the Sonic voice AI project.
License
CC-BY-4.0 (following NVIDIA's original license)