	---
	license: apache-2.0
	library_name: mlx
	tags:
	- mlx
	- whisper
	- speech-recognition
	- automatic-speech-recognition
	- fp16
	- apple-silicon
	- ios
	- coreml
	language:
	- en
	- zh
	- de
	- es
	- ru
	- ko
	- fr
	- ja
	- pt
	- tr
	- pl
	- ca
	- nl
	- ar
	- sv
	- it
	- id
	- hi
	- fi
	- vi
	- he
	- uk
	- el
	- ms
	- cs
	- ro
	- da
	- hu
	- ta
	- "no"
	- th
	- ur
	- hr
	- bg
	- lt
	- la
	- mi
	- ml
	- cy
	- sk
	- te
	- fa
	- lv
	- bn
	- sr
	- az
	- sl
	- kn
	- et
	- mk
	- br
	- eu
	- is
	- hy
	- ne
	- mn
	- bs
	- kk
	- sq
	- sw
	- gl
	- mr
	- pa
	- si
	- km
	- sn
	- yo
	- so
	- af
	- oc
	- ka
	- be
	- tg
	- sd
	- gu
	- am
	- yi
	- lo
	- uz
	- fo
	- ht
	- ps
	- tk
	- nn
	- mt
	- sa
	- lb
	- my
	- bo
	- tl
	- mg
	- as
	- tt
	- haw
	- ln
	- ha
	- ba
	- jw
	- su
	- yue
	pipeline_tag: automatic-speech-recognition
	base_model: openai/whisper-tiny
	---

	# Whisper Tiny - MLX FP16

	This is the [OpenAI Whisper Tiny](https://huggingface.co/openai/whisper-tiny) model converted to [MLX](https://github.com/ml-explore/mlx) format with FP16 precision, optimized for Apple Silicon inference.

	## Model Details

	| Property | Value |
	|---|---|
	| Base Model | openai/whisper-tiny |
	| Parameters | ~39M |
	| Format | MLX SafeTensors (FP16) |
	| Model Size | 70.94 MB |
	| Sample Rate | 16,000 Hz |
	| Audio Layers | 4 |
	| Text Layers | 4 |
	| Hidden Size | 384 |
	| Attention Heads | 6 |
	| Vocabulary Size | 51,865 |
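
A few derived quantities follow from the table. The sketch below is illustrative only: the 30-second window, the 160-sample hop, and the stride-2 convolution are assumptions taken from the reference Whisper implementation and are not stated in this card.

```python
# Values taken from the Model Details table.
SAMPLE_RATE = 16_000      # Hz
HIDDEN_SIZE = 384
ATTENTION_HEADS = 6

# Assumptions from the reference Whisper implementation (not stated in this card).
CHUNK_SECONDS = 30        # audio is processed in fixed-length windows
HOP_LENGTH = 160          # samples between consecutive mel frames
CONV_STRIDE = 2           # the encoder's second convolution halves the frame count

head_dim = HIDDEN_SIZE // ATTENTION_HEADS          # width of each attention head
samples_per_chunk = SAMPLE_RATE * CHUNK_SECONDS    # raw samples per window
mel_frames = samples_per_chunk // HOP_LENGTH       # spectrogram frames per window
encoder_positions = mel_frames // CONV_STRIDE      # sequence length seen by the encoder

print(head_dim, samples_per_chunk, mel_frames, encoder_positions)
```

Under these assumptions each attention head is 64 dimensions wide, and one window of 480,000 samples becomes 1,500 encoder positions.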

	## Intended Use

	This model is optimized for on-device automatic speech recognition (ASR) on Apple Silicon devices (Mac, iPhone, iPad). It is designed for use with the [WhisperKit](https://github.com/argmaxinc/WhisperKit) or [MLX](https://github.com/ml-explore/mlx) frameworks.

	## Files

	- `config.json` - Model configuration
	- `model.safetensors` - Model weights in SafeTensors format (FP16)
	- `multilingual.tiktoken` - Tokenizer vocabulary for the multilingual Whisper tokenizer (tiktoken format)
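
To confirm that a local copy of the repository is complete, the file list above can be checked with a small helper. This is a minimal sketch: `missing_files` and `EXPECTED_FILES` are hypothetical names introduced here, and the directory path in the usage note is a placeholder.

```python
from pathlib import Path

# File names as listed in the Files section of this card.
EXPECTED_FILES = ("config.json", "model.safetensors", "multilingual.tiktoken")


def missing_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

For example, `missing_files("./Whisper-Tiny-MLX-FP16")` returns an empty list when all three files are present.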

	## Usage

	```python
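	import mlx_whisper

	# Transcribe a local audio file; the model is downloaded from the Hugging Face Hub if it is not available locally.
	result = mlx_whisper.transcribe(
	    "audio.mp3",
	    path_or_hf_repo="aitytech/Whisper-Tiny-MLX-FP16",
	)
	print(result["text"])  # full transcript as a single string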
	```
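
Beyond the full transcript, per-segment timestamps are often useful. The sketch below assumes the result dictionary follows the reference Whisper layout, with a `segments` list whose entries carry `start`, `end`, and `text`, and that `language` is accepted as a decoding option. `format_segments` and `transcribe_with_timestamps` are hypothetical helper names, not part of `mlx_whisper`.

```python
def format_segments(result):
    """Render each segment as '[start - end] text', one per line."""
    lines = []
    for seg in result.get("segments", []):
        text = seg["text"].strip()
        lines.append(f"[{seg['start']:.2f}s - {seg['end']:.2f}s] {text}")
    return "\n".join(lines)


def transcribe_with_timestamps(audio_path, language=None):
    """Transcribe audio_path and return timestamped lines.

    language=None is assumed to let the model detect the language itself.
    """
    import mlx_whisper  # imported lazily so format_segments works without MLX installed

    result = mlx_whisper.transcribe(
        audio_path,
        path_or_hf_repo="aitytech/Whisper-Tiny-MLX-FP16",
        language=language,
    )
    return format_segments(result)
```

Calling `print(transcribe_with_timestamps("audio.mp3", language="en"))` would then print one line per segment.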

	## Original Model

	- Paper: [Robust Speech Recognition via Large-Scale Weak Supervision](https://arxiv.org/abs/2212.04356)
	- Authors: OpenAI
	- License: Apache-2.0