Whisper Base - MLX FP16

This is the OpenAI Whisper Base model converted to MLX format with FP16 precision, optimized for Apple Silicon inference.

Model Details

Property	Value
Base Model	openai/whisper-base
Parameters	~74M nominal (71.8M stored in the converted weights)
Format	MLX SafeTensors (FP16)
Model Size	137.02 MB
Sample Rate	16,000 Hz
Audio Layers	6
Text Layers	6
Hidden Size	512
Attention Heads	8
Vocabulary Size	51,865
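
As a rough cross-check of the size figure: FP16 stores two bytes per weight, so the file size should be close to twice the parameter count. The sketch below assumes that the 137.02 MB figure is in binary megabytes (MiB) and ignores the small SafeTensors header. Both are assumptions rather than facts stated here, and the helper names are illustrative.

```python
BYTES_PER_FP16 = 2   # one half-precision float occupies two bytes
MIB = 2 ** 20        # assumption: the "MB" in the table means MiB


def params_from_size(size_mib: float) -> float:
    """Estimate how many FP16 weights fit in a file of the given size."""
    return size_mib * MIB / BYTES_PER_FP16


def size_from_params(n_params: float) -> float:
    """Estimate the FP16 file size in MiB for a given parameter count."""
    return n_params * BYTES_PER_FP16 / MIB


implied = params_from_size(137.02)
print(f"{implied / 1e6:.1f}M parameters implied by the file size")
print(f"{size_from_params(74e6):.1f} MiB expected for a nominal 74M parameters")
```

Under those assumptions the file size implies about 71.8M stored weights, slightly below the nominal ~74M figure usually quoted for Whisper Base.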

Intended Use

This model is optimized for on-device automatic speech recognition (ASR) on Apple Silicon devices (Mac, iPhone, iPad). It is designed for use with the WhisperKit or MLX frameworks.

Files

config.json - Model configuration
model.safetensors - Model weights in SafeTensors format (FP16)
multilingual.tiktoken - Tokenizer
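
If the repository has been downloaded to a local folder, a quick sanity check is to confirm that the three files listed above are present before loading. This is a minimal sketch: the helper name `missing_files` and the example directory are hypothetical and not part of any library.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = ("config.json", "model.safetensors", "multilingual.tiktoken")


def missing_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# Hypothetical local path; replace with wherever the files were saved.
missing = missing_files("./Whisper-Base-MLX-FP16")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All expected files are present.")
```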

Usage

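# Requires the mlx-whisper package (pip install mlx-whisper) and ffmpeg on PATH to decode audio.
import mlx_whisper

# Transcribe a local audio file; the weights are fetched from the Hugging Face Hub on first use.
result = mlx_whisper.transcribe(
    "audio.mp3",
    path_or_hf_repo="aitytech/Whisper-Base-MLX-FP16",
)

# The full transcript is returned under the "text" key.
print(result["text"])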
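
Beyond the plain transcript, the result can be turned into subtitles. The sketch below assumes the returned dictionary also carries a "segments" list whose entries have "start", "end" (in seconds) and "text" keys, as in the reference Whisper output. The helpers `format_timestamp`, `to_srt`, and `transcribe_to_srt` are illustrative names, not part of mlx_whisper.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Build SRT text from segments with 'start', 'end' and 'text' keys."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def transcribe_to_srt(audio_path: str) -> str:
    """Transcribe a file with this model and return SRT-formatted text."""
    import mlx_whisper  # imported lazily so the helpers above work without it

    result = mlx_whisper.transcribe(
        audio_path,
        path_or_hf_repo="aitytech/Whisper-Base-MLX-FP16",
    )
    # Assumption: "segments" is present in the result dictionary.
    return to_srt(result["segments"])
```

Calling `transcribe_to_srt("audio.mp3")` on an Apple Silicon machine would then return subtitle text that can be written to a `.srt` file.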

Original Model

Paper: Robust Speech Recognition via Large-Scale Weak Supervision (arXiv:2212.04356)
Authors: OpenAI
License: Apache-2.0
