---
license: mit
tags:
- mlx
- 4bit
- whisper
- stt
- metalrt
- apple-silicon
---

# Whisper Medium - MLX 4-bit Quantized

Custom MLX 4-bit quantization of [OpenAI Whisper Medium](https://huggingface.co/openai/whisper-medium), optimized for [MetalRT](https://github.com/RunanywhereAI/metalrt-binaries) GPU inference on Apple Silicon.

## Usage

[RCLI](https://github.com/RunanywhereAI/RCLI) uses this model with the MetalRT engine for speech-to-text:

```bash
rcli setup   # select MetalRT or Both engines
```

> **Note**: Whisper Medium is in GPU beta. Whisper Tiny is recommended for production use.

## Performance (Apple M3 Max)

| Metric | Value |
|--------|-------|
| Latency (1.2 s audio) | 233 ms |
| Real-time factor (RTF) | 0.19x |
| Quantization | MLX 4-bit |
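
The RTF figure can be reproduced from the latency row. The sketch below assumes the common definition of real-time factor (processing time divided by audio duration, so lower is faster); the card does not state which definition was used, and the function name `real_time_factor` is only illustrative.

```python
def real_time_factor(processing_ms: float, audio_ms: float) -> float:
    """Processing time divided by audio duration (assumed RTF definition)."""
    if audio_ms <= 0:
        raise ValueError("audio duration must be positive")
    return processing_ms / audio_ms


# Figures from the performance table: 233 ms to transcribe 1.2 s of audio.
rtf = real_time_factor(processing_ms=233, audio_ms=1200)

# The inverse tells how many times faster than real time the run was.
speedup = 1 / rtf

print(f"RTF = {rtf:.2f}, about {speedup:.1f}x faster than real time")
```

Under that definition, 233 ms for 1.2 s of audio gives an RTF of roughly 0.19, which agrees with the table.
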

## License

Model weights: [MIT](https://github.com/openai/whisper/blob/main/LICENSE) (OpenAI)

MetalRT engine: [Proprietary](https://github.com/RunanywhereAI/metalrt-binaries/blob/main/LICENSE) (RunAnywhere, Inc.)

## Contact

founder@runanywhere.ai | https://runanywhere.ai