metadata
license: apache-2.0
library_name: mlx
tags:
- mlx
- whisper
- speech-recognition
- automatic-speech-recognition
- fp16
- apple-silicon
- ios
- coreml
language:
- en
- zh
- de
- es
- ru
- ko
- fr
- ja
- pt
- tr
- pl
- ca
- nl
- ar
- sv
- it
- id
- hi
- fi
- vi
- he
- uk
- el
- ms
- cs
- ro
- da
- hu
- ta
- 'no'
- th
- ur
- hr
- bg
- lt
- la
- mi
- ml
- cy
- sk
- te
- fa
- lv
- bn
- sr
- az
- sl
- kn
- et
- mk
- br
- eu
- is
- hy
- ne
- mn
- bs
- kk
- sq
- sw
- gl
- mr
- pa
- si
- km
- sn
- yo
- so
- af
- oc
- ka
- be
- tg
- sd
- gu
- am
- yi
- lo
- uz
- fo
- ht
- ps
- tk
- nn
- mt
- sa
- lb
- my
- bo
- tl
- mg
- as
- tt
- haw
- ln
- ha
- ba
- jw
- su
- yue
pipeline_tag: automatic-speech-recognition
base_model: openai/whisper-tiny
Whisper Tiny - MLX FP16
This is the OpenAI Whisper Tiny model converted to MLX format with FP16 precision, optimized for Apple Silicon inference.
Model Details
| Property | Value |
|---|---|
| Base Model | openai/whisper-tiny |
| Parameters | ~39M |
| Format | MLX SafeTensors (FP16) |
| Model Size | 70.94 MB |
| Sample Rate | 16,000 Hz |
| Audio Layers | 4 |
| Text Layers | 4 |
| Hidden Size | 384 |
| Attention Heads | 6 |
| Vocabulary Size | 51,865 |
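
Two quantities follow directly from the table: the width of each attention head (hidden size divided by head count) and the number of samples in a clip at the listed sample rate. The sketch below is illustrative only. The `WhisperTinyDims` class and its field names are hypothetical and are not part of MLX or of this repository; only the numbers are taken from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WhisperTinyDims:
    """Values copied from the Model Details table (names are hypothetical)."""

    sample_rate: int = 16_000
    audio_layers: int = 4
    text_layers: int = 4
    hidden_size: int = 384
    attention_heads: int = 6
    vocab_size: int = 51_865

    @property
    def head_dim(self) -> int:
        # The hidden size must split evenly across the attention heads.
        if self.hidden_size % self.attention_heads != 0:
            raise ValueError("hidden size must be divisible by head count")
        return self.hidden_size // self.attention_heads

    def samples_for(self, seconds: float) -> int:
        # Number of audio samples in a clip at the model's sample rate.
        return round(seconds * self.sample_rate)


dims = WhisperTinyDims()
print(dims.head_dim)         # 384 / 6
print(dims.samples_for(30))  # samples in 30 s of 16 kHz audio
```

With the table's values this gives 64 dimensions per head and 480,000 samples for a 30-second clip.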
Intended Use
This model is optimized for on-device automatic speech recognition (ASR) on Apple Silicon devices (Mac, iPhone, iPad). It is designed for use with the WhisperKit or MLX frameworks.
Files
- config.json: Model configuration
- model.safetensors: Model weights in SafeTensors format (FP16)
- multilingual.tiktoken: Tokenizer
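
If the repository has been downloaded to a local folder, checking that the three files listed above are present can save a confusing load error later. This is a minimal sketch using only the Python standard library. The helper name `missing_model_files` and the example folder path are hypothetical.

```python
from pathlib import Path

# File names as listed in the Files section of this card.
EXPECTED_FILES = ("config.json", "model.safetensors", "multilingual.tiktoken")


def missing_model_files(model_dir: str) -> list[str]:
    """Return the expected files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Hypothetical local path; replace with wherever the files were saved.
    missing = missing_model_files("./Whisper-Tiny-MLX-FP16")
    print("missing files:", missing or "none")
```

An empty list means all three files were found; any names returned are the ones still to be downloaded.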
Usage
import mlx_whisper

# Transcribe a local audio file using the weights from this repository.
result = mlx_whisper.transcribe(
    "audio.mp3",
    path_or_hf_repo="aitytech/Whisper-Tiny-MLX-FP16",
)
print(result["text"])
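
Beyond the full transcript, the dictionary returned by `mlx_whisper.transcribe` should also carry per-segment timing, following the layout of the original OpenAI Whisper package: a `segments` list whose entries have `start`, `end`, and `text` keys. The sketch below assumes that layout. `to_timestamp` and `format_segments` are hypothetical helpers, and `sample` is made-up data for illustration, not real model output.

```python
def to_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS.mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def format_segments(result: dict) -> list[str]:
    """Return one line per segment: '[start --> end] text'."""
    lines = []
    for seg in result.get("segments", []):
        start, end = to_timestamp(seg["start"]), to_timestamp(seg["end"])
        lines.append(f"[{start} --> {end}] {seg['text'].strip()}")
    return lines


# Made-up result in the assumed layout, standing in for transcribe() output.
sample = {
    "text": " Hello there. How are you?",
    "segments": [
        {"start": 0.0, "end": 1.5, "text": " Hello there."},
        {"start": 1.5, "end": 3.25, "text": " How are you?"},
    ],
}

for line in format_segments(sample):
    print(line)
```

For a real transcription, pass the `result` returned by `mlx_whisper.transcribe` to `format_segments` in place of `sample`.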
Original Model
- Paper: Robust Speech Recognition via Large-Scale Weak Supervision
- Authors: OpenAI
- License: Apache-2.0