Azerbaijani Whisper Small
Fine-tuned openai/whisper-small for Azerbaijani automatic speech recognition.
Performance
| Model | Params | WER | CER |
|---|---|---|---|
| whisper-small (baseline) | 242M | 52.17% | 14.52% |
| whisper-medium (baseline) | 769M | 34.54% | 9.00% |
| whisper-large-v3 (baseline) | 1543M | 21.00% | 5.51% |
| azerbaijani-whisper-small | 242M | 20.54% | 5.72% |
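This model achieves a lower WER than whisper-large-v3 (20.54% vs 21.00%) with a slightly higher CER (5.72% vs 5.51%), while being roughly 6x smaller (242M vs 1543M parameters).
Evaluated on the FLEURS Azerbaijani test set.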
Usage
pip install --upgrade transformers
import librosa
import soundfile as sf
import torch
from transformers import WhisperProcessor, WhisperForConditionalGeneration
# Load model and processor
processor = WhisperProcessor.from_pretrained("LocalDoc/azerbaijani-whisper-small")
model = WhisperForConditionalGeneration.from_pretrained("LocalDoc/azerbaijani-whisper-small")
# Load audio; soundfile returns shape (frames,) for mono or (frames, channels) for multi-channel files
audio, sr = sf.read("audio.wav")
# Convert stereo to mono first, so that resampling runs along the time axis
if audio.ndim > 1:
    audio = audio.mean(axis=1)
# Resample to 16 kHz if needed (important!)
if sr != 16000:
    audio = librosa.resample(audio, orig_sr=sr, target_sr=16000)
# Transcribe
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
forced_ids = processor.get_decoder_prompt_ids(language="az", task="transcribe")
with torch.no_grad():
    ids = model.generate(inputs.input_features, forced_decoder_ids=forced_ids)
text = processor.batch_decode(ids, skip_special_tokens=True)[0]
print(text)
Note: Audio must be 16 kHz mono. If your audio has a different sample rate, use
librosa.resample() as shown above. Passing audio without resampling will produce incorrect results.
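A quick pre-check can catch mismatched files before they reach the model. The sketch below uses only Python's standard `wave` module to read the header of an uncompressed PCM WAV file and report whether it already meets the 16 kHz mono requirement. The helper name `needs_conversion` is hypothetical, and other formats (FLAC, MP3, floating-point WAV) would need `soundfile` instead.

```python
import wave

TARGET_SR = 16000  # sample rate the model expects

def needs_conversion(path):
    """Return (resample_needed, downmix_needed) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        channels = wav.getnchannels()
    return sample_rate != TARGET_SR, channels != 1
```

If either flag is true, the file needs the matching conversion step (downmix to mono, resample to 16 kHz) before it is passed to the processor.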
Requirements
pip install transformers torch soundfile librosa
Benchmark Details
All models were evaluated on the FLEURS Azerbaijani test split (921 samples) with the same text normalization (lowercase, punctuation removed).
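For readers who want to reproduce this kind of scoring, here is a minimal pure-Python sketch of WER and CER with lowercase and punctuation-stripping normalization. It is an illustration under assumptions, not the evaluation script used for this model card: the function names are hypothetical, errors are pooled over the whole corpus rather than averaged per utterance, CER counts spaces as characters, and plain `str.lower()` does not apply Azerbaijani-specific casing for I/İ.

```python
import unicodedata

def normalize(text):
    """Lowercase, drop Unicode punctuation, and collapse whitespace."""
    kept = "".join(
        ch for ch in text.lower()
        if not unicodedata.category(ch).startswith("P")
    )
    return " ".join(kept.split())

def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def error_rate(refs, hyps, unit="word"):
    """Corpus-level error rate: total edits / total reference units."""
    errors = total = 0
    for ref, hyp in zip(refs, hyps):
        ref_n, hyp_n = normalize(ref), normalize(hyp)
        ref_u = ref_n.split() if unit == "word" else list(ref_n)
        hyp_u = hyp_n.split() if unit == "word" else list(hyp_n)
        errors += edit_distance(ref_u, hyp_u)
        total += len(ref_u)
    return errors / total

wer = error_rate(["Salam, dünya!"], ["salam dunya"], unit="word")
cer = error_rate(["Salam, dünya!"], ["salam dunya"], unit="char")
```

Because insertions count as errors while the denominator is the reference length, the rate can exceed 100%, which is consistent with the whisper-tiny WER of 104.48% in the table below.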
| Model | Params | WER | CER | RTF (GPU) |
|---|---|---|---|---|
| whisper-tiny | 38M | 104.48% | 53.93% | 0.033 |
| whisper-base | 73M | 82.63% | 30.35% | 0.032 |
| whisper-small | 242M | 52.17% | 14.52% | 0.053 |
| whisper-medium | 769M | 34.54% | 9.00% | 0.097 |
| whisper-large-v3 | 1543M | 21.00% | 5.51% | 0.129 |
| whisper-large-v3-turbo | 809M | 22.99% | 6.55% | 0.024 |
| azerbaijani-whisper-small | 242M | 20.54% | 5.72% | ~0.05 |
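The table does not define RTF. Assuming the usual meaning of real-time factor (processing time divided by audio duration, so lower is faster), a measurement could be sketched as below. `measure_rtf` and the `transcribe` callable are hypothetical names, and accurate GPU timing would also need a device synchronization before each clock read.

```python
import time

def measure_rtf(transcribe, audio, sample_rate=16000):
    """Return processing_time / audio_duration for one call to `transcribe`."""
    audio_seconds = len(audio) / sample_rate
    start = time.perf_counter()
    transcribe(audio)  # any callable that runs the model on `audio`
    elapsed = time.perf_counter() - start
    return elapsed / audio_seconds
```

Under this definition, an RTF of about 0.05 means a 60-second clip is transcribed in roughly 3 seconds.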
License
Apache 2.0