Whisper Small Hausa - Fine-tuned Speech Recognition

Model Details

This is a fine-tuned version of OpenAI's Whisper Small model, specifically optimized for Hausa speech recognition.

Training Configuration:

  • Base Model: openai/whisper-small
  • Framework: PyTorch with 🤗 Transformers
  • Max Steps: 10,000
  • Learning Rate: 5e-5
  • Batch Size: 2 (per device)
  • Gradient Accumulation Steps: 2
  • FP16 Training: Enabled
  • Warmup Steps: 500
  • Max Gradient Norm: 1.0
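
The card does not include the training script; the hyperparameters above can be collected into a config dict whose keys mirror Hugging Face `Seq2SeqTrainingArguments` parameter names (a sketch of the setup, not the authors' actual code):

```python
# Values taken from the card above; key names follow the
# transformers Seq2SeqTrainingArguments convention (assumption).
training_config = {
    "model_name_or_path": "openai/whisper-small",
    "max_steps": 10_000,
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 2,
    "fp16": True,
    "warmup_steps": 500,
    "max_grad_norm": 1.0,
}

# Effective batch size per optimizer update (per device):
effective_batch = (training_config["per_device_train_batch_size"]
                   * training_config["gradient_accumulation_steps"])
print(effective_batch)  # → 4
```

With a per-device batch of 2 and 2 accumulation steps, each optimizer update sees an effective batch of 4 examples per device.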

Data Processing:

  • Resampling to 16kHz
  • Noise reduction using spectral gating
  • Streaming dataset with 10% validation split

Training Progress

Step     Training Loss   Validation Loss
  500    1.216000        1.182656
 1000    1.016500        0.977233
 1500    0.915600        0.856705
 2000    0.857600        0.789428
 2500    0.803500        0.715454
 3000    0.746300        0.663170
 3500    0.706800        0.626299
 4000    0.681500        0.613087
 4500    0.661200        0.561574
 5000    0.575900        0.519949
 5500    0.614600        0.492101
 6000    0.584200        0.461233
 6500    0.585500        0.428762
 7000    0.506500        0.408919
 7500    0.497500        0.384893
 8000    0.571700        0.353761
 8500    0.472900        0.337668
 9000    0.449900        0.325715
 9500    0.463600        0.314076
10000    0.414100        0.301878
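
The quoted 74.5% improvement is the relative drop in validation loss from the first evaluation (step 500) to the last (step 10,000):

```python
initial_val_loss = 1.182656   # validation loss at step 500
final_val_loss = 0.301878     # validation loss at step 10,000

reduction = (initial_val_loss - final_val_loss) / initial_val_loss
print(f"{reduction:.1%}")  # → 74.5%
```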

Final Results:

  • Final Training Loss: 0.414100
  • Final Validation Loss: 0.301878
  • Validation loss reduction: 74.5% (from 1.182656 at step 500)
  • Optimized for Nigerian Hausa dialects
  • WER: 40.6 (common_voice_17_0)
  • CER: 21.4 (common_voice_17_0)
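
WER and CER are word- and character-level error rates: edit distance between hypothesis and reference, divided by the reference length. The card does not say which evaluation library was used; this is a standard Levenshtein-based sketch of the definitions, not the actual evaluation script:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences via dynamic programming."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev = dp[0]
        dp[0] = i
        for j, h in enumerate(hyp, 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,          # deletion
                        dp[j - 1] + 1,      # insertion
                        prev + (r != h))    # substitution (free if equal)
            prev = cur
    return dp[-1]

def wer(ref, hyp):
    """Word error rate: word-level edit distance / reference word count."""
    r, h = ref.split(), hyp.split()
    return edit_distance(r, h) / len(r)

def cer(ref, hyp):
    """Character error rate: character-level edit distance / reference length."""
    return edit_distance(ref, hyp) / len(ref)
```

For example, `wer("ina kwana", "ina kwana")` is 0.0, while a hypothesis with one wrong word out of three gives a WER of 1/3.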
Model Format:

  • Format: Safetensors
  • Model size: 0.2B params
  • Tensor type: F32