whisper-small-sr

A fine-tuned version of OpenAI Whisper Small for Serbian automatic speech recognition.

Output script: this model is intended to produce Serbian Latin only.
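A minimal transcription sketch using the Hugging Face `transformers` pipeline (the audio file name is a placeholder; generation settings are an assumption, not taken from this card):

```python
# Sketch: transcribe a Serbian audio file with this model via transformers.
# Requires `transformers` and `torch`; "example_serbian.wav" is a placeholder path.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="istomin9192/whisper-small-sr",
)

# Whisper works on 16 kHz audio; the pipeline resamples common formats itself.
result = asr("example_serbian.wav")
print(result["text"])  # Serbian transcription in Latin script
```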

  • WER on Common Voice 24.0 Serbian test: 7.09%

Model description

This model is a fine-tuned checkpoint of OpenAI Whisper Small for Serbian automatic speech recognition. It transcribes Serbian speech into the Latin script and reaches a word error rate (WER) of 7.09% on the Common Voice 24.0 Serbian test set.

Training and evaluation data

This model was fine-tuned on a mixture of publicly available Serbian speech corpora, including:

  • Mozilla Common Voice 24.0, evaluated on CV test (sr)
  • FLEURS Serbian
  • ParlaSpeech-RS (subset of the full dataset)
  • Additional Serbian corpora used in the training pipeline

Training procedure

  • Epochs: 8
  • Batch size: 32
  • Optimizer: AdamW
  • LR: 6e-5 with warmup (50 steps) + cosine decay to min_lr = 1e-7
  • Mixed precision: bfloat16
  • SpecAugment: frequency + time masking
  • Sampling: weighted sampling across datasets
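The learning-rate schedule above (linear warmup for 50 steps, then cosine decay from 6e-5 down to min_lr = 1e-7) can be sketched in plain Python; the total step count below is a stand-in value, not taken from this card:

```python
import math

def lr_at(step: int, total_steps: int,
          peak_lr: float = 6e-5, warmup_steps: int = 50,
          min_lr: float = 1e-7) -> float:
    """Linear warmup to peak_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

# End of warmup hits the peak; the final step lands on min_lr.
print(lr_at(50, 10_000))      # 6e-05
print(lr_at(10_000, 10_000))  # 1e-07
```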

Training results

Epoch   Train loss   CV WER
1       0.331        0.1562
2       0.338        0.1202
3       0.241        0.1062
4       0.187        0.0913
5       0.150        0.0853
6       0.122        0.0745
7       0.106        0.0709

Evaluation metrics

  • WER (normalized) on Common Voice 24.0 Serbian test: 7.09%
  • Text normalization used for WER:
    • punctuation removed
    • lowercased
    • Cyrillic → Latin conversion
    • numbers converted to words
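The normalization steps above, followed by a standard WER computation, can be sketched as follows. This is a plain-Python sketch, not the card's actual evaluation code: the number-to-words step is omitted (it needs a Serbian-aware library such as num2words), and the transliteration table covers the standard Serbian Cyrillic alphabet.

```python
import re

# Standard Serbian Cyrillic -> Latin transliteration (lowercase forms).
CYR2LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}

def normalize(text: str) -> str:
    """Lowercase, strip punctuation, transliterate Cyrillic to Latin.

    Number-to-word conversion is intentionally left out of this sketch.
    """
    text = text.lower()
    text = re.sub(r"[^\w\s]", "", text)  # drop punctuation (\w is Unicode-aware)
    text = "".join(CYR2LAT.get(ch, ch) for ch in text)
    return " ".join(text.split())        # collapse whitespace

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate via word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] = edit distance between the first i-1 ref words and first j hyp words
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution
        prev = cur
    return prev[-1] / len(ref)

print(normalize("Здраво, свете!"))  # zdravo svete
```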
