Whisper Large V3 - MLX Q4 Quantized

A 4-bit quantized version of OpenAI's Whisper Large V3, optimized for Apple Silicon with MLX. This is the smallest of the large-v3 variants listed below.

Model Details

Property          Value
Original Model    openai/whisper-large-v3
Parameters        ~1.55B
Quantization      INT4 (Q4)
Size              ~900MB
Decoder Layers    32

Roughly half the size of the Q8 variant (~900MB vs. ~1.6GB) with minimal accuracy loss. The best choice for memory-constrained devices.
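The size figures can be sanity-checked with a little arithmetic. The sketch below assumes MLX's default group-wise quantization layout (groups of 64 weights, each storing one 16-bit scale and one 16-bit bias) and pretends every parameter is quantized, which a real conversion typically does not do. The function name and the group-size assumption are illustrative and are not taken from this model's conversion settings.

```python
def estimated_size_mb(num_params, bits, group_size=64, overhead_bits=32):
    """Rough size in MB (10^6 bytes) for group-wise quantized weights.

    Assumption: each group of `group_size` weights carries one fp16 scale
    and one fp16 bias, i.e. 32 extra bits per group.
    """
    bits_per_weight = bits + overhead_bits / group_size
    return num_params * bits_per_weight / 8 / 1e6


PARAMS = 1.55e9  # approximate parameter count listed under Model Details

q4_mb = estimated_size_mb(PARAMS, bits=4)
q8_mb = estimated_size_mb(PARAMS, bits=8)

print(f"Q4: ~{q4_mb:.0f} MB")
print(f"Q8: ~{q8_mb:.0f} MB")
print(f"Q4/Q8 ratio: {q4_mb / q8_mb:.2f}")
```

Under these assumptions the estimate comes out near 870 MB for Q4 and 1.65 GB for Q8, in line with the sizes listed here. The ratio is slightly above one half because the per-group scale and bias cost the same at either bit width.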

Other Whisper Models

Model                 Size      Quality              Link
small                 ~300MB    Good                 LibraxisAI/whisper-small-mlx-q8
medium                ~800MB    Better               LibraxisAI/whisper-medium-mlx-q8
large-v3 q8           ~1.6GB    Best                 LibraxisAI/whisper-large-v3-mlx-q8
large-v3 q4 (this)    ~900MB    Best (compressed)    -
large-v3-turbo        ~900MB    Great (fast)         LibraxisAI/whisper-large-v3-turbo-mlx-q8

Usage

import mlx_whisper

result = mlx_whisper.transcribe(
    "audio.wav",
    path_or_hf_repo="LibraxisAI/whisper-large-v3-q4"
)
print(result["text"])
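Beyond the full text, the returned result can also be used to print per-segment timing. This is a minimal sketch, assuming the result dictionary includes a `segments` list whose entries have `start`, `end`, and `text` keys, as in the reference Whisper implementation. The helper names are hypothetical and are not part of mlx-whisper.

```python
def format_timestamp(seconds):
    """Convert a time in seconds to an HH:MM:SS.mmm string."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def format_segments(segments):
    """Render each segment as a '[start --> end] text' line."""
    return [
        f"[{format_timestamp(s['start'])} --> {format_timestamp(s['end'])}] "
        f"{s['text'].strip()}"
        for s in segments
    ]


def transcribe_with_timestamps(audio_path, repo="LibraxisAI/whisper-large-v3-q4"):
    # Imported lazily so the formatting helpers work without MLX installed.
    import mlx_whisper

    result = mlx_whisper.transcribe(audio_path, path_or_hf_repo=repo)
    # Assumption: `segments` is present in the result dictionary.
    return format_segments(result["segments"])
```

Calling `transcribe_with_timestamps("audio.wav")` on an Apple Silicon machine would then return one formatted line per segment.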

Supported Languages

Full multilingual support: English, Polish, German, French, Spanish, Italian, Portuguese, Dutch, Russian, Chinese, Japanese, Korean, Arabic, Hindi, and the rest of Whisper's roughly 100 supported languages.
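When the spoken language is known in advance, it can be passed explicitly instead of relying on auto-detection. The sketch below assumes `mlx_whisper.transcribe` forwards a `language` keyword to the decoder and accepts Whisper's two-letter codes, as the reference implementation does. The lookup table and helper names are illustrative and cover only the languages named above.

```python
# Two-letter Whisper language codes for the languages named above.
LANGUAGE_CODES = {
    "english": "en", "polish": "pl", "german": "de", "french": "fr",
    "spanish": "es", "italian": "it", "portuguese": "pt", "dutch": "nl",
    "russian": "ru", "chinese": "zh", "japanese": "ja", "korean": "ko",
    "arabic": "ar", "hindi": "hi",
}


def language_code(name):
    """Look up the code for a language name, ignoring case and padding."""
    key = name.strip().lower()
    if key not in LANGUAGE_CODES:
        raise ValueError(f"No code known for language: {name!r}")
    return LANGUAGE_CODES[key]


def transcribe_in(audio_path, language_name, repo="LibraxisAI/whisper-large-v3-q4"):
    # Imported lazily so the lookup helper works without MLX installed.
    import mlx_whisper

    result = mlx_whisper.transcribe(
        audio_path,
        path_or_hf_repo=repo,
        language=language_code(language_name),  # assumption: forwarded as a decode option
    )
    return result["text"]
```

For example, `transcribe_in("audio.wav", "Polish")` would request Polish decoding for the whole file.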

Hardware Requirements

  • Apple Silicon Mac (M1/M2/M3/M4)
  • Minimum 8GB RAM
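These requirements can be checked from Python before downloading the weights. This is a rough sketch: the function names are hypothetical, and it assumes macOS reports physical memory through `sysctl -n hw.memsize`. A Python interpreter running under Rosetta reports `x86_64` and would be rejected by this check.

```python
import platform
import subprocess

MIN_RAM_BYTES = 8 * 1024**3  # 8 GB minimum from the list above


def meets_requirements(system, machine, ram_bytes):
    """True for an arm64 macOS host with at least 8 GB of RAM."""
    return system == "Darwin" and machine == "arm64" and ram_bytes >= MIN_RAM_BYTES


def detect_host():
    """Return (system, machine, ram_bytes) for the current machine."""
    system, machine = platform.system(), platform.machine()
    ram_bytes = 0  # left at 0 on non-macOS hosts, which fail the check anyway
    if system == "Darwin":
        # `sysctl -n hw.memsize` prints physical memory in bytes on macOS.
        out = subprocess.check_output(["sysctl", "-n", "hw.memsize"])
        ram_bytes = int(out.decode().strip())
    return system, machine, ram_bytes


if __name__ == "__main__":
    print("OK" if meets_requirements(*detect_host()) else "Unsupported host")
```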

License

MIT - inherited from OpenAI Whisper.


Converted by LibraxisAI using mlx-whisper
