---
license: mit
language:
- multilingual
- en
- ru
tags:
- whisper
- gguf
- quantized
- speech-recognition
- rust
- candle
base_model:
- openai/whisper-tiny
pipeline_tag: automatic-speech-recognition
---

# WHISPER-TINY - GGUF Quantized Models

Quantized versions of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) in GGUF format.

## Directory Structure

```
tiny/
├── whisper-tiny-q*.gguf    # Candle-compatible GGUF models (root)
├── model-tiny-q80.gguf     # Candle-compatible legacy naming (q8_0 format)
├── config-tiny.json        # Model configuration for Candle
├── tokenizer-tiny.json     # Tokenizer for Candle
└── whisper.cpp/            # whisper.cpp-compatible models
    └── whisper-tiny-q*.gguf
```

### Format Compatibility

- **Root directory** (`whisper-tiny-*.gguf`): use with **Candle** (Rust ML framework)
  - Tensor names include the `model.` prefix (e.g., `model.encoder.conv1.weight`)
  - Requires `config-tiny.json` and `tokenizer-tiny.json`
- **whisper.cpp/** directory: use with **whisper.cpp** (C++ implementation)
  - Tensor names omit the `model.` prefix (e.g., `encoder.conv1.weight`)
  - Compatible with the whisper.cpp CLI tools

Both directories contain `.gguf` files, not `.bin` files.
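
Since both layouts ship GGUF files, a quick way to confirm that a downloaded file really is GGUF (rather than a legacy `.bin`) is to check the 4-byte magic at the start of the file. A minimal sketch in Python; the path you pass is a placeholder for one of the files above:

```python
def is_gguf(path: str) -> bool:
    """Return True if the file starts with the GGUF magic bytes (ASCII 'GGUF')."""
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"
```

For example, `is_gguf("tiny/whisper-tiny-q4_k.gguf")` should return `True` for any model in this repository.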

## Available Formats

| Format | Quality    | Use Case                       |
|--------|------------|--------------------------------|
| q2_k   | Smallest   | Extreme compression            |
| q3_k   | Small      | Mobile devices                 |
| q4_0   | Good       | Legacy compatibility           |
| q4_k   | Good       | **Recommended for production** |
| q4_1   | Good+      | Legacy with bias               |
| q5_0   | Very Good  | Legacy compatibility           |
| q5_k   | Very Good  | High quality                   |
| q5_1   | Very Good+ | Legacy with bias               |
| q6_k   | Excellent  | Near-lossless                  |
| q8_0   | Excellent  | Minimal loss, benchmarking     |
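
The qualitative levels above can be translated into rough size estimates using approximate bits-per-weight figures. The numbers in this sketch follow common GGML quantization conventions and are assumptions for illustration, not measurements of these files:

```python
# Approximate bits per weight for common GGML quantization formats
# (illustrative figures; real file sizes also include metadata and
# a few tensors kept at higher precision, so expect some overhead).
BITS_PER_WEIGHT = {
    "q2_k": 2.6, "q3_k": 3.4, "q4_0": 4.5, "q4_1": 5.0,
    "q4_k": 4.5, "q5_0": 5.5, "q5_1": 6.0, "q5_k": 5.5,
    "q6_k": 6.6, "q8_0": 8.5,
}

def estimated_size_mb(n_params: float, fmt: str) -> float:
    """Rough model size in MB for a given parameter count and format."""
    return n_params * BITS_PER_WEIGHT[fmt] / 8 / 1e6

# whisper-tiny has roughly 39M parameters
print(round(estimated_size_mb(39e6, "q4_k"), 1))  # ~22 MB
```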

## Usage

### With Candle (Rust)

**Command line example:**
```bash
# Run the Candle whisper example with a quantized model from this repo.
# Note: --features is a cargo flag, so it must come before the `--` separator.
cargo run --example whisper --release --features symphonia -- \
  --quantized \
  --model tiny \
  --model-id oxide-lab/whisper-tiny-GGUF
```
|
| | ### With whisper.cpp (C++) |
| |
|
| | ```bash |
| | # Use models from whisper.cpp/ subdirectory |
| | ./whisper.cpp/build/bin/whisper-cli \ |
| | --model models/openai/tiny/whisper.cpp/whisper-tiny-q4_k.gguf \ |
| | --file audio.wav |
| | ``` |
| |
|
| | ### Recommended Format |
| |
|
| | For most use cases, we recommend **q4_k** format as it provides the best balance of: |
| | - Size reduction (~65% smaller) |
| | - Quality (minimal degradation) |
| | - Speed (faster inference than higher quantizations) |
| | |

## Quantization Details

- **Source Model**: [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny)
- **Quantization Methods**:
  - **Candle GGUF** (root directory): Python-based quantization, converting PyTorch weights directly to GGUF
    - Adds the `model.` prefix to tensor names for Candle compatibility
  - **whisper.cpp GGML** (whisper.cpp/ subdirectory): produced with the whisper-quantize tool
    - Uses the original tensor names without a prefix
- **Format**: GGUF for both directories
- **Total Formats**: 10 quantization levels (q2_k through q8_0)
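
The only structural difference between the two tensor layouts is the `model.` prefix, so mapping names between the Candle and whisper.cpp conventions is a simple string transform. A sketch for illustration (these helper names are our own, not part of either project):

```python
def to_candle_name(name: str) -> str:
    """Add the 'model.' prefix expected by Candle, if not already present."""
    return name if name.startswith("model.") else "model." + name

def to_whisper_cpp_name(name: str) -> str:
    """Strip the 'model.' prefix to get whisper.cpp-style tensor names."""
    return name[len("model."):] if name.startswith("model.") else name

print(to_candle_name("encoder.conv1.weight"))             # model.encoder.conv1.weight
print(to_whisper_cpp_name("model.encoder.conv1.weight"))  # encoder.conv1.weight
```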

## License

Same as the original Whisper model (MIT License).

## Citation

```bibtex
@misc{radford2022whisper,
  doi = {10.48550/ARXIV.2212.04356},
  url = {https://arxiv.org/abs/2212.04356},
  author = {Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
  title = {Robust Speech Recognition via Large-Scale Weak Supervision},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
```