Paper: Robust Speech Recognition via Large-Scale Weak Supervision (arXiv:2212.04356)
This is the OpenAI Whisper Tiny model converted to MLX format with FP16 precision, optimized for Apple Silicon inference.
| Property | Value |
|---|---|
| Base Model | openai/whisper-tiny |
| Parameters | ~39M |
| Format | MLX SafeTensors (FP16) |
| Model Size | 70.94 MB |
| Sample Rate | 16,000 Hz |
| Audio Layers | 4 |
| Text Layers | 4 |
| Hidden Size | 384 |
| Attention Heads | 6 |
| Vocabulary Size | 51,865 |
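
A couple of derived quantities follow directly from the table and can be useful when sizing buffers. The short sketch below computes the per-head attention dimension and the number of samples in one input window. The 30-second window length is an assumption taken from the upstream Whisper design; it is not stated in this card.

```python
# Values copied from the table above.
SAMPLE_RATE_HZ = 16_000
HIDDEN_SIZE = 384
ATTENTION_HEADS = 6

# Assumption: upstream Whisper processes audio in 30-second windows.
WINDOW_SECONDS = 30

# Each attention head works on an equal slice of the hidden state.
head_dim = HIDDEN_SIZE // ATTENTION_HEADS

# Number of raw audio samples that make up one input window.
samples_per_window = SAMPLE_RATE_HZ * WINDOW_SECONDS

print(f"head dimension: {head_dim}")
print(f"samples per {WINDOW_SECONDS}s window: {samples_per_window}")
```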
This model is optimized for on-device automatic speech recognition (ASR) on Apple Silicon devices (Mac, iPhone, iPad). It is designed for use with the WhisperKit or MLX frameworks.
- `config.json` - Model configuration
- `model.safetensors` - Model weights in SafeTensors format (FP16)
- `multilingual.tiktoken` - Tokenizer

```python
import mlx_whisper

result = mlx_whisper.transcribe(
    "audio.mp3",
    path_or_hf_repo="aitytech/Whisper-Tiny-MLX-FP16",
)
print(result["text"])
```
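
Beyond the full transcript, it is often handy to print per-segment timestamps. The sketch below assumes the returned dictionary also carries a `"segments"` list whose entries have `"start"`, `"end"`, and `"text"` keys, mirroring the upstream Whisper result layout; verify this against the `mlx_whisper` version you have installed. The helper names `format_timestamp`, `format_segments`, and `transcribe_with_timestamps` are hypothetical and exist only for this illustration.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS.mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def format_segments(result: dict) -> list[str]:
    """Turn a transcription result into one timestamped line per segment.

    Assumes each segment is a dict with "start", "end", and "text" keys.
    """
    lines = []
    for seg in result.get("segments", []):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} -> {end}] {seg['text'].strip()}")
    return lines


def transcribe_with_timestamps(audio_path: str) -> list[str]:
    """Transcribe a file with this model and return timestamped lines."""
    # Imported lazily so the formatting helpers work without MLX installed.
    import mlx_whisper

    result = mlx_whisper.transcribe(
        audio_path,
        path_or_hf_repo="aitytech/Whisper-Tiny-MLX-FP16",
    )
    return format_segments(result)
```

On an Apple Silicon machine with `mlx_whisper` installed, `print("\n".join(transcribe_with_timestamps("audio.mp3")))` would print one line per segment.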