# whisper-small-sr

## Model description

A fine-tuned version of OpenAI Whisper Small for Serbian automatic speech recognition.

- Output script: this model is intended to produce Serbian Latin only.
- WER on the Common Voice 24.0 Serbian test set: 7.09%

## Training and evaluation data
This model was fine-tuned on a mixture of publicly available Serbian speech corpora, including:
- Mozilla Common Voice 24.0 (Serbian); its test split is used for evaluation
- FLEURS Serbian
- ParlaSpeech-RS (subset of the full dataset)
- Additional Serbian corpora used in the training pipeline
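Per the training procedure below, these corpora are mixed via weighted sampling. A minimal sketch of how such a mixture can be drawn; the weights here are hypothetical, as the actual values are not published:

```python
import random

# Hypothetical per-dataset sampling weights (not from this card).
DATASET_WEIGHTS = {"common_voice": 0.5, "fleurs": 0.2, "parlaspeech": 0.3}

def sample_dataset(rng: random.Random) -> str:
    """Pick a source dataset with probability proportional to its weight."""
    names = list(DATASET_WEIGHTS)
    weights = [DATASET_WEIGHTS[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

Each training example is then drawn from the chosen dataset, so smaller corpora can be up- or down-weighted relative to their raw size.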
## Training procedure
- Epochs: 8
- Batch size: 32
- Optimizer: AdamW
- Learning rate: 6e-5 with 50 warmup steps, then cosine decay to min_lr = 1e-7
- Mixed precision: bfloat16
- SpecAugment: frequency + time masking
- Sampling: weighted sampling across datasets
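The learning-rate schedule above can be sketched as follows; only the 6e-5 peak, the 50 warmup steps, and the 1e-7 floor come from this card, and the total step count is a placeholder:

```python
import math

def lr_at_step(step, total_steps, peak_lr=6e-5, warmup_steps=50, min_lr=1e-7):
    """Linear warmup to peak_lr, then cosine decay to min_lr."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

The rate ramps to 6e-5 over the first 50 steps and decays smoothly to 1e-7 by the final step.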
## Training results
| Epoch | Train loss | CV WER |
|---|---|---|
| 1 | 0.331 | 0.1562 |
| 2 | 0.338 | 0.1202 |
| 3 | 0.241 | 0.1062 |
| 4 | 0.187 | 0.0913 |
| 5 | 0.150 | 0.0853 |
| 6 | 0.122 | 0.0745 |
| 7 | 0.106 | 0.0709 |
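The CV WER column above is word error rate: word-level edit distance divided by the number of reference words. A minimal pure-Python sketch (not the actual evaluation script):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate via word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[j] = edit distance between the current ref prefix and hyp[:j],
    # computed with a single rolling row.
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, start=1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,          # deletion
                        dp[j - 1] + 1,      # insertion
                        prev + (r != h))    # substitution or match
            prev = cur
    return dp[len(hyp)] / max(1, len(ref))
```

For example, `wer("a b c", "a x c")` counts one substitution over three reference words.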
## Evaluation metrics
- WER (normalized) on the Common Voice 24.0 Serbian test set: 7.09%
- Text normalization applied before computing WER:
  - punctuation removed
  - lowercased
  - Cyrillic → Latin conversion
  - numbers converted to words
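A rough sketch of the normalization above, assuming the standard Serbian Cyrillic → Gaj's Latin table; number-to-word conversion is omitted for brevity, and this is not the card's actual evaluation code:

```python
import re

# Serbian Cyrillic -> Latin (Gaj's alphabet), lowercase forms only;
# input is lowercased before transliteration.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}

def normalize(text: str) -> str:
    text = text.lower()
    text = "".join(CYR_TO_LAT.get(ch, ch) for ch in text)
    # Strip punctuation: keep letters, digits, and whitespace.
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.split())
```

For example, `normalize("Здраво, свете!")` yields `"zdravo svete"`, matching the Latin-script, punctuation-free references used for scoring.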
Base model: openai/whisper-small