CrisperWhisper Unsloth (MLX, FP16)

TL;DR: Please consider using my FP8 finetune instead. In my preliminary tests and benchmarks it performs similarly to this FP16 variant, while running twice as fast and using half the memory!

This repo provides the CrisperWhisper model converted to MLX for fast on-device ASR on Apple Silicon.

This model works exceptionally well for scenarios where word-level precision is desired. Instead of producing grammatically polished sentences, this model is fine-tuned for verbatim, word-by-word transcription - which is exactly what you want for interviews or Alexa-like home-automation applications.
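Word-level precision pairs naturally with word timestamps: mlx_whisper's transcribe accepts word_timestamps=True, and the result's segments then carry a words list with per-word start/end times. A minimal sketch of walking that structure - the sample dict below is illustrative, standing in for a real result from transcribe(..., word_timestamps=True):

```python
# Illustrative sample of an mlx_whisper result with word timestamps.
# A real result comes from:
#   mlx_whisper.transcribe("audio.wav",
#                          path_or_hf_repo="kyr0/crisperwhisper-unsloth-mlx",
#                          word_timestamps=True)
sample = {
    "text": " So I was like",
    "segments": [
        {
            "text": " So I was like",
            "words": [
                {"word": " So", "start": 0.10, "end": 0.32},
                {"word": " I", "start": 0.40, "end": 0.51},
                {"word": " was", "start": 0.55, "end": 0.78},
                {"word": " like", "start": 0.82, "end": 1.10},
            ],
        }
    ],
}

def word_timeline(result):
    """Flatten per-word (start, end, word) tuples across all segments."""
    return [
        (w["start"], w["end"], w["word"].strip())
        for seg in result["segments"]
        for w in seg.get("words", [])
    ]

for start, end, word in word_timeline(sample):
    print(f"{start:5.2f}-{end:5.2f}  {word}")
```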

Huge credit to Laurin from nyra.health and Daniel + Michael from Unsloth for the heavy lifting. Free for non-commercial use only!

See Laurin's original paper for more details.

Base model: unsloth/CrisperWhisper (Torch) → converted via mlx-examples/whisper/convert.py.

What’s inside

  • weights.safetensors — MLX FP16 weights
  • config.json — MLX Whisper config

Usage (recommended: auto-download from Hugging Face)

mlx_whisper accepts Hugging Face repo IDs in path_or_hf_repo and will download the model automatically.

from mlx_whisper import transcribe

out = transcribe(
    "audio.wav",
    path_or_hf_repo="kyr0/crisperwhisper-unsloth-mlx",
)
print(out["text"])

Usage (local path)

If you already have a local MLX folder, point path_or_hf_repo to it:

from mlx_whisper import transcribe

out = transcribe(
    "audio.wav",
    path_or_hf_repo="./mlx_models/crisperwhisper-unsloth-mlx",
)
print(out["text"])

Live Transcription!

Please follow me on GitHub - I'm working on a local, private live-transcription system for Apple Silicon: kyr0's crispr-live-mlx
