license: mit
tags:
  - mlx
  - 4bit
  - whisper
  - stt
  - metalrt
  - apple-silicon

Whisper Medium — MLX 4-bit Quantized

Custom MLX 4-bit quantization of OpenAI Whisper Medium, optimized for MetalRT GPU inference on Apple Silicon.

Usage

Used by RCLI with the MetalRT engine for speech-to-text:

rcli setup          # select MetalRT or Both engines

Note: Whisper Medium is in GPU beta. Whisper Tiny is recommended for production use.

Performance (Apple M3 Max)

| Metric | Value |
| --- | --- |
| Latency (1.2 s audio) | 233 ms |
| RTF (real-time factor) | 0.19x |
| Quantization | MLX 4-bit |
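
The RTF figure follows from the latency row: real-time factor is processing time divided by audio duration, so a value below 1 means transcription runs faster than real time. A minimal Python sketch of that arithmetic, assuming the reported latency covers the whole 1.2 s clip; the helper name `real_time_factor` is hypothetical and is not part of RCLI or MetalRT:

```python
def real_time_factor(latency_ms: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (both in seconds)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return (latency_ms / 1000.0) / audio_seconds


# Figures reported above: 233 ms to transcribe 1.2 s of audio.
rtf = real_time_factor(233, 1.2)
print(f"RTF = {rtf:.2f}x")  # prints "RTF = 0.19x"
```

The same helper can be used to compare other models or clip lengths, as long as latency and duration are measured on the same audio.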

License

Model weights: MIT (OpenAI)

MetalRT engine: Proprietary (RunAnywhere, Inc.)

Contact

founder@runanywhere.ai | https://runanywhere.ai