🎤 Whisper Small Fine-tuned Model (0 dB Noise / High Noise)

This model is a fine-tuned version of openai/whisper-small for speech recognition under high-noise (0 dB SNR) conditions.


📌 Model Details

  • Base Model: openai/whisper-small
  • Noise Condition: 0dB
  • Epochs: 10
  • Total Steps: 790
  • Best Checkpoint: checkpoint-700

πŸ† Best Evaluation Results

The best model was selected at step 700:

  • WER (Word Error Rate): 77.7023 %
  • CER (Character Error Rate): 34.8762 %
  • Eval Loss: 0.5518
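Both metrics are edit-distance ratios: WER over words, CER over characters. A minimal pure-Python sketch of how such values are computed (this is the standard definition, not necessarily the exact evaluation script used for this model):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            # RHS is evaluated first, so d[j] below is still the previous-row value.
            prev, d[j] = d[j], min(d[j] + 1,         # deletion
                                   d[j - 1] + 1,     # insertion
                                   prev + (r != h))  # substitution (or match)
    return d[len(hyp)]

def wer(ref, hyp):
    """Word Error Rate in percent."""
    r, h = ref.split(), hyp.split()
    return 100 * edit_distance(r, h) / len(r)

def cer(ref, hyp):
    """Character Error Rate in percent."""
    return 100 * edit_distance(ref, hyp) / len(ref)
```

For example, `wer("the cat sat", "the cat sit")` gives one substitution out of three words, i.e. 33.33 %.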

📊 Training Summary

| Metric | Value |
| --- | --- |
| Final Training Loss | ~0.0673 |
| Best Step | 700 |
| Total Steps | 790 |
| Epochs | 10 |
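As a quick sanity check on these counts: with the batch size (32) and gradient accumulation (2) listed in the hyperparameters below, and assuming a single GPU, the step totals imply roughly the following per-epoch numbers (the dataset-size figure is an approximation, ignoring any partial final batch):

```python
total_steps = 790
epochs = 10
batch_size = 32
grad_accum = 2

steps_per_epoch = total_steps // epochs    # 79 optimizer steps per epoch
effective_batch = batch_size * grad_accum  # 64 samples per optimizer step
approx_samples_per_epoch = steps_per_epoch * effective_batch
print(steps_per_epoch, effective_batch, approx_samples_per_epoch)  # 79 64 5056
```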

📉 Evaluation History (Key Steps)

| Step | WER | CER | Eval Loss |
| --- | --- | --- | --- |
| 100 | 93.33 | 47.17 | 0.7525 |
| 200 | 83.26 | 37.40 | 0.4301 |
| 300 | 80.05 | 34.88 | 0.3891 |
| 400 | 78.75 | 34.78 | 0.4148 |
| 500 | 78.13 | 33.95 | 0.4685 |
| 600 | 78.69 | 34.83 | 0.5106 |
| 700 | 77.70 | 34.88 | 0.5518 |

βš™οΈ Training Hyperparameters

  • Learning Rate: 1e-5
  • Batch Size: 32
  • Optimizer: AdamW (fused)
  • LR Scheduler: Linear
  • Warmup Steps: 100
  • Epochs: 10
  • Mixed Precision: AMP
  • Gradient Accumulation: 2
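These hyperparameters map naturally onto Hugging Face `Seq2SeqTrainingArguments`. The sketch below assumes the standard `Trainer` API was used; the argument names come from the `transformers` library, the values from this card, and the output path and eval/save cadence are assumptions (the 100-step cadence matches the evaluation history above):

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the configuration described above; the actual training
# script for this model may differ in the assumed details.
training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-small-0db",  # hypothetical output path
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    gradient_accumulation_steps=2,
    num_train_epochs=10,
    warmup_steps=100,
    lr_scheduler_type="linear",
    optim="adamw_torch_fused",       # AdamW (fused)
    fp16=True,                       # mixed precision (AMP)
    evaluation_strategy="steps",     # `eval_strategy` in newer transformers
    eval_steps=100,                  # assumed from the 100-step eval history
    save_steps=100,
    load_best_model_at_end=True,     # would select checkpoint-700 by best WER
    metric_for_best_model="wer",
    greater_is_better=False,
    predict_with_generate=True,
)
```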

🎯 Intended Use

  • Speech-to-text transcription
  • ASR experiments on high-noise audio (0 dB SNR)
  • Research / academic use

⚠️ Limitations

  • Performance decreases on very noisy / unseen domains
  • Not fully optimized for production use
  • Dataset diversity affects generalization

🧠 Notes

  • Best model selected from checkpoint-700
  • Training continued to step 790, but the best performance was reached earlier (step 700)
  • Metrics may vary depending on dataset split