---
license: mit
language:
- multilingual
- en
- ru
tags:
- whisper
- gguf
- quantized
- speech-recognition
- rust
- candle
base_model:
- openai/whisper-tiny
pipeline_tag: automatic-speech-recognition
---
# WHISPER-TINY - GGUF Quantized Models
Quantized versions of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) in GGUF format.
## Directory Structure
```
tiny/
├── whisper-tiny-q*.gguf     # Candle-compatible GGUF models (root)
├── model-tiny-q80.gguf      # Candle-compatible legacy naming (q8_0 format)
├── config-tiny.json         # Model configuration for Candle
├── tokenizer-tiny.json      # Tokenizer for Candle
└── whisper.cpp/ # whisper.cpp-compatible models
└── whisper-tiny-q*.gguf
```
### Format Compatibility
- **Root directory** (`whisper-tiny-*.gguf`): Use with **Candle** (Rust ML framework)
- Tensor names include `model.` prefix (e.g., `model.encoder.conv1.weight`)
- Requires `config-tiny.json` and `tokenizer-tiny.json`
- **whisper.cpp/** directory: Use with **whisper.cpp** (C++ implementation)
- Tensor names without `model.` prefix (e.g., `encoder.conv1.weight`)
- Compatible with whisper.cpp CLI tools
- Both directories contain `.gguf` files, not `.bin` files
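Both layouts use the same GGUF container, so a downloaded file can be sanity-checked by reading its fixed header (magic `GGUF`, then a little-endian u32 version, u64 tensor count, and u64 metadata KV count, per the GGUF specification). A minimal sketch using only the Python standard library; the synthetic bytes below stand in for the first 24 bytes of a real `.gguf` file, and the counts in them are made up for illustration:

```python
import struct

GGUF_MAGIC = b"GGUF"

def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed GGUF header: magic, version, tensor count, metadata KV count."""
    if data[:4] != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file (magic = {data[:4]!r})")
    # Little-endian: u32 version, u64 n_tensors, u64 n_metadata_kv
    version, n_tensors, n_kv = struct.unpack_from("<IQQ", data, 4)
    return {"version": version, "tensors": n_tensors, "metadata_kv": n_kv}

# Synthetic header standing in for a real file's first 24 bytes
fake = GGUF_MAGIC + struct.pack("<IQQ", 3, 167, 19)
print(read_gguf_header(fake))  # {'version': 3, 'tensors': 167, 'metadata_kv': 19}
```

Reading the full metadata (including tensor names, to tell the Candle and whisper.cpp variants apart) requires walking the KV section that follows this header.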
## Available Formats
| Format | Quality | Use Case |
|--------|---------|----------|
| q2_k | Smallest | Extreme compression |
| q3_k | Small | Mobile devices |
| q4_0 | Good | Legacy compatibility |
| q4_k | Good | **Recommended for production** |
| q4_1 | Good+ | Legacy with bias |
| q5_0 | Very Good | Legacy compatibility |
| q5_k | Very Good | High quality |
| q5_1 | Very Good+ | Legacy with bias |
| q6_k | Excellent | Near-lossless |
| q8_0 | Excellent | Minimal loss, benchmarking |
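The relative sizes of these formats follow from the GGML block layouts: each format packs a fixed-size block of weights into a fixed number of bytes (e.g. q4_0 stores 32 weights in 18 bytes, i.e. 4.5 bits per weight; k-quants use 256-weight super-blocks). A rough sizing sketch, assuming ~39M parameters for whisper-tiny and ignoring file headers and any tensors kept at higher precision:

```python
# Approximate bits per weight for GGML block quant formats
# (block bytes / weights per block; k-quants use 256-weight super-blocks).
BITS_PER_WEIGHT = {
    "q2_k": 84 / 256 * 8,   # 2.625
    "q3_k": 110 / 256 * 8,  # ~3.44
    "q4_0": 18 / 32 * 8,    # 4.5
    "q4_1": 20 / 32 * 8,    # 5.0
    "q4_k": 144 / 256 * 8,  # 4.5
    "q5_0": 22 / 32 * 8,    # 5.5
    "q5_1": 24 / 32 * 8,    # 6.0
    "q5_k": 176 / 256 * 8,  # 5.5
    "q6_k": 210 / 256 * 8,  # ~6.56
    "q8_0": 34 / 32 * 8,    # 8.5
}

def approx_size_mib(n_params: float, fmt: str) -> float:
    """Rough quantized weight size in MiB, ignoring headers and unquantized tensors."""
    return n_params * BITS_PER_WEIGHT[fmt] / 8 / 2**20

WHISPER_TINY_PARAMS = 39e6  # ~39M parameters (approximate)
for fmt in ("q4_k", "q8_0"):
    print(f"{fmt}: ~{approx_size_mib(WHISPER_TINY_PARAMS, fmt):.0f} MiB")
```

Actual file sizes will differ somewhat because embeddings and normalization tensors are often stored at higher precision.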
## Usage
### With Candle (Rust)
**Command line example:**
```bash
# Run the Candle Whisper example with a quantized model from this repo.
# Note: --features must come before the `--` separator; everything after
# `--` is passed to the example binary itself.
cargo run --example whisper --release --features symphonia -- \
    --quantized \
    --model tiny \
    --model-id oxide-lab/whisper-tiny-GGUF
```
### With whisper.cpp (C++)
```bash
# Use models from whisper.cpp/ subdirectory
./whisper.cpp/build/bin/whisper-cli \
--model models/openai/tiny/whisper.cpp/whisper-tiny-q4_k.gguf \
--file audio.wav
```
### Recommended Format
For most use cases, we recommend the **q4_k** format, as it provides the best balance of:
- Size reduction (~65% smaller than the fp16 weights)
- Quality (minimal degradation)
- Speed (faster inference than the higher-bit formats)
## Quantization Details
- **Source Model**: [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny)
- **Quantization Methods**:
- **Candle GGUF** (root directory): Python-based quantization, converting directly from PyTorch to GGUF
- Adds `model.` prefix to tensor names for Candle compatibility
- **whisper.cpp GGML** (whisper.cpp/ subdirectory): whisper-quantize tool
- Uses original tensor names without prefix
- **Format**: GGUF (GGML Universal Format) for both directories
- **Total Formats**: 10 quantization levels (q2_k through q8_0)
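The only structural difference between the two tensor layouts is the `model.` name prefix. A minimal sketch of that renaming step (the helper names here are illustrative, not taken from either toolchain):

```python
PREFIX = "model."

def to_candle_name(name: str) -> str:
    """Add the `model.` prefix Candle expects (idempotent)."""
    return name if name.startswith(PREFIX) else PREFIX + name

def to_whisper_cpp_name(name: str) -> str:
    """Strip the `model.` prefix for whisper.cpp-style tensor names."""
    return name[len(PREFIX):] if name.startswith(PREFIX) else name

print(to_candle_name("encoder.conv1.weight"))             # model.encoder.conv1.weight
print(to_whisper_cpp_name("model.encoder.conv1.weight"))  # encoder.conv1.weight
```

Because both helpers are idempotent, applying one after the other round-trips any tensor name between the two conventions.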
## License
Same as the original Whisper model (MIT License).
## Citation
```bibtex
@misc{radford2022whisper,
doi = {10.48550/ARXIV.2212.04356},
url = {https://arxiv.org/abs/2212.04356},
author = {Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
title = {Robust Speech Recognition via Large-Scale Weak Supervision},
publisher = {arXiv},
year = {2022},
copyright = {arXiv.org perpetual, non-exclusive license}
}
```