---
license: apache-2.0
library_name: mlx
tags:
- mlx
- whisper
- speech-recognition
- automatic-speech-recognition
- fp16
- apple-silicon
- ios
- coreml
language:
- en
- zh
- de
- es
- ru
- ko
- fr
- ja
- pt
- tr
- pl
- ca
- nl
- ar
- sv
- it
- id
- hi
- fi
- vi
- he
- uk
- el
- ms
- cs
- ro
- da
- hu
- ta
- "no"
- th
- ur
- hr
- bg
- lt
- la
- mi
- ml
- cy
- sk
- te
- fa
- lv
- bn
- sr
- az
- sl
- kn
- et
- mk
- br
- eu
- is
- hy
- ne
- mn
- bs
- kk
- sq
- sw
- gl
- mr
- pa
- si
- km
- sn
- yo
- so
- af
- oc
- ka
- be
- tg
- sd
- gu
- am
- yi
- lo
- uz
- fo
- ht
- ps
- tk
- nn
- mt
- sa
- lb
- my
- bo
- tl
- mg
- as
- tt
- haw
- ln
- ha
- ba
- jw
- su
- yue
pipeline_tag: automatic-speech-recognition
base_model: openai/whisper-medium
---
# Whisper Medium - MLX FP16
This is the [OpenAI Whisper Medium](https://huggingface.co/openai/whisper-medium) model converted to [MLX](https://github.com/ml-explore/mlx) format with FP16 precision, optimized for Apple Silicon inference.
## Model Details
| Property | Value |
|---|---|
| Base Model | openai/whisper-medium |
| Parameters | ~769M |
| Format | MLX SafeTensors (FP16) |
| Model Size | 1,454.10 MB |
| Sample Rate | 16,000 Hz |
| Audio Layers | 24 |
| Text Layers | 24 |
| Hidden Size | 1024 |
| Attention Heads | 16 |
| Vocabulary Size | 51,865 |
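
As a rough sanity check on the size figure, FP16 stores each parameter in 2 bytes, so the weight size can be estimated from the parameter count. The sketch below assumes "MB" in the table means mebibytes (2^20 bytes) and uses the rounded ~769M count, so it only approximates the reported 1,454.10 MB. The helper name `fp16_size_mib` is hypothetical.

```python
BYTES_PER_FP16 = 2  # half precision uses 16 bits per parameter


def fp16_size_mib(n_params: int) -> float:
    """Estimate FP16 weight size in MiB (2**20 bytes) from a parameter count."""
    return n_params * BYTES_PER_FP16 / 2**20


# Rounded parameter count from the table; the exact count is not stated here.
estimate = fp16_size_mib(769_000_000)
print(f"{estimate:.2f} MiB")
```

The estimate lands within about 1% of the reported file size, which is consistent with the parameter count being rounded.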
## Intended Use
This model is optimized for on-device automatic speech recognition (ASR) on Apple Silicon devices (Mac, iPhone, iPad). It is designed for use with the [WhisperKit](https://github.com/argmaxinc/WhisperKit) or [MLX](https://github.com/ml-explore/mlx) frameworks.
## Files
- `config.json` - Model configuration
- `model.safetensors` - Model weights in SafeTensors format (FP16)
- `multilingual.tiktoken` - Tokenizer
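
After downloading the files, the values in `config.json` can be compared against the Model Details table. This is a minimal sketch that assumes the config uses the field names common to Whisper MLX conversions (`n_audio_layer`, `n_text_layer`, `n_audio_state`, `n_audio_head`, `n_vocab`); these names are an assumption and may differ in this repository. The helper functions are hypothetical.

```python
import json
from pathlib import Path

# Values from the Model Details table, keyed by the ASSUMED config field names.
EXPECTED = {
    "n_audio_layer": 24,
    "n_text_layer": 24,
    "n_audio_state": 1024,
    "n_audio_head": 16,
    "n_vocab": 51865,
}


def config_mismatches(config: dict) -> dict:
    """Return {key: (expected, actual)} for every field that differs or is missing."""
    return {
        key: (want, config.get(key))
        for key, want in EXPECTED.items()
        if config.get(key) != want
    }


def check_config_file(path: str = "config.json") -> dict:
    """Load a local config.json and compare it with the table values."""
    return config_mismatches(json.loads(Path(path).read_text()))
```

An empty dictionary from `check_config_file()` means every checked field matches the table. A missing key shows up with `None` as its actual value.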
## Usage
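```python
# Requires the mlx-whisper package: pip install mlx-whisper
import mlx_whisper

# Weights are fetched from the Hugging Face Hub on first use.
result = mlx_whisper.transcribe(
    "audio.mp3",
    path_or_hf_repo="aitytech/Whisper-Medium-MLX-FP16",
)
print(result["text"])
```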
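
Besides the full text, the result returned by `mlx_whisper.transcribe` also carries a `segments` list with `start`, `end`, and `text` fields. The sketch below shows one way to print timestamped lines from it. The helper names (`format_timestamp`, `format_segments`, `transcribe_with_timestamps`) are hypothetical, and the timestamp layout is an arbitrary choice.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS.mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def format_segments(result: dict) -> list[str]:
    """Turn a transcription result into one timestamped line per segment."""
    return [
        f"[{format_timestamp(seg['start'])} -> {format_timestamp(seg['end'])}] "
        f"{seg['text'].strip()}"
        for seg in result.get("segments", [])
    ]


def transcribe_with_timestamps(audio_path: str, language=None) -> list[str]:
    """Transcribe a file and return timestamped lines.

    language=None leaves language detection to the model.
    """
    # Imported lazily so the formatting helpers work without MLX installed.
    import mlx_whisper

    result = mlx_whisper.transcribe(
        audio_path,
        path_or_hf_repo="aitytech/Whisper-Medium-MLX-FP16",
        language=language,
    )
    return format_segments(result)
```

Calling `transcribe_with_timestamps("audio.mp3", language="en")` skips language detection, which can help on short clips.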
## Original Model
- **Paper:** [Robust Speech Recognition via Large-Scale Weak Supervision](https://arxiv.org/abs/2212.04356)
- **Authors:** OpenAI
- **License:** Apache-2.0