Quantized versions of openai/whisper-small in GGUF format.
small/
├── whisper-small-q*.gguf # Candle-compatible GGUF models (root)
├── config.json # Model configuration for Candle
├── tokenizer.json # Tokenizer for Candle
└── whisper.cpp/ # whisper.cpp-compatible models
    └── whisper-small-q*.gguf
Root directory (whisper-small-*.gguf): Use with Candle (Rust ML framework)
- Tensor names include the model. prefix (e.g., model.encoder.conv1.weight)
- Requires config-small.json and tokenizer-small.json

whisper.cpp/ directory: Use with whisper.cpp (C++ implementation)
- Tensor names have no model. prefix (e.g., encoder.conv1.weight)
- Uses .gguf files, not .bin files

| Format | Quality | Use Case |
|---|---|---|
| q2_k | Smallest | Extreme compression |
| q3_k | Small | Mobile devices |
| q4_0 | Good | Legacy compatibility |
| q4_k | Good | Recommended for production |
| q4_1 | Good+ | Legacy with bias |
| q5_0 | Very Good | Legacy compatibility |
| q5_k | Very Good | High quality |
| q5_1 | Very Good+ | Legacy with bias |
| q6_k | Excellent | Near-lossless |
| q8_0 | Excellent | Minimal loss, benchmarking |
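The Candle and whisper.cpp files described before the table differ in tensor naming: the Candle files carry a model. prefix and the whisper.cpp files do not. A minimal Python sketch of that renaming follows. It assumes the prefix is the only naming difference between the two sets, and the function names are hypothetical helpers, not part of Candle or whisper.cpp.

```python
# Sketch only: assumes the "model." prefix is the sole difference between
# the Candle and whisper.cpp tensor names. Helper names are hypothetical.
CANDLE_PREFIX = "model."


def to_candle_name(name: str) -> str:
    """Return the Candle-style name, adding the prefix if it is missing."""
    if name.startswith(CANDLE_PREFIX):
        return name
    return CANDLE_PREFIX + name


def to_whisper_cpp_name(name: str) -> str:
    """Return the whisper.cpp-style name, dropping the prefix if present."""
    if name.startswith(CANDLE_PREFIX):
        return name[len(CANDLE_PREFIX):]
    return name


print(to_candle_name("encoder.conv1.weight"))             # model.encoder.conv1.weight
print(to_whisper_cpp_name("model.encoder.conv1.weight"))  # encoder.conv1.weight
```

Both helpers are idempotent, so applying one to a name that is already in the target form leaves it unchanged.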
Running this model with Candle requires modifying Candle's whisper example code. For a quicker and easier first try with Candle, use the tiny model instead: https://huggingface.co/oxide-lab/whisper-tiny-GGUF
Command line example:
# Run Candle Whisper with local quantized model
cargo run --example whisper --release --features symphonia -- \
  --quantized \
  --model small \
  --model-id oxide-lab/whisper-small-GGUF
# Use models from whisper.cpp/ subdirectory
./whisper.cpp/build/bin/whisper-cli \
--model models/openai/small/whisper.cpp/whisper-small-q4_k.gguf \
--file audio.wav
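Each runtime reads its file from a different location in the repository layout. The lookup can be sketched in Python as below. This is an illustration only: it assumes every format in the table has a file named whisper-small-<format>.gguf, as the q4_k example suggests, and model_path is a hypothetical helper rather than an API of either project.

```python
from pathlib import PurePosixPath

# Quantization formats listed in the table; assumed to each have a file.
FORMATS = ("q2_k", "q3_k", "q4_0", "q4_k", "q4_1",
           "q5_0", "q5_k", "q5_1", "q6_k", "q8_0")
RUNTIMES = ("candle", "whisper.cpp")


def model_path(fmt: str, runtime: str, root: str = "small") -> PurePosixPath:
    """Build the repository-relative path of a quantized model file.

    Candle files sit in the root directory; whisper.cpp files sit in the
    whisper.cpp/ subdirectory. The file name pattern is an assumption.
    """
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    if runtime not in RUNTIMES:
        raise ValueError(f"unknown runtime: {runtime}")
    base = PurePosixPath(root)
    if runtime == "whisper.cpp":
        base = base / "whisper.cpp"
    return base / f"whisper-small-{fmt}.gguf"


print(model_path("q4_k", "candle"))       # small/whisper-small-q4_k.gguf
print(model_path("q4_k", "whisper.cpp"))  # small/whisper.cpp/whisper-small-q4_k.gguf
```

Rejecting unknown formats and runtimes up front keeps a typo from silently producing a path to a file that does not exist.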
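For most use cases, we recommend the q4_k format, which the table above lists as the production choice.

Conversion note: the Candle files in the root directory have the model. prefix added to tensor names for Candle compatibility.

License: same as the original Whisper model (MIT License).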
@misc{radford2022whisper,
doi = {10.48550/ARXIV.2212.04356},
url = {https://arxiv.org/abs/2212.04356},
author = {Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
title = {Robust Speech Recognition via Large-Scale Weak Supervision},
publisher = {arXiv},
year = {2022},
copyright = {arXiv.org perpetual, non-exclusive license}
}