# Whisper Small Fine-tuned Model (0dB Noise, High Noise)
This model is a fine-tuned version of openai/whisper-small for speech recognition under noisy (0dB SNR) conditions.
## Model Details
- Base Model: openai/whisper-small
- Noise Condition: 0dB SNR
- Epochs: 10
- Total Steps: 790
- Best Checkpoint: checkpoint-700
## Best Evaluation Results
The best model was selected at step 700:
- WER (Word Error Rate): 77.7023
- CER (Character Error Rate): 34.8762
- Eval Loss: 0.5518
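For reference, WER and CER are both normalized edit-distance metrics. A minimal pure-Python sketch of how they are computed (not the exact evaluation script used for this model, which this card does not specify):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (one-row DP)."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (r != h))  # substitution
    return dp[-1]

def wer(ref, hyp):
    """Word Error Rate: word-level edit distance / reference word count, in %."""
    ref_words, hyp_words = ref.split(), hyp.split()
    return 100.0 * edit_distance(ref_words, hyp_words) / len(ref_words)

def cer(ref, hyp):
    """Character Error Rate: the same computation at the character level."""
    return 100.0 * edit_distance(list(ref), list(hyp)) / len(ref)
```

So a WER of 77.70 means roughly 0.78 word-level edits (substitutions, insertions, deletions) per reference word.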
## Training Summary
| Metric | Value |
|---|---|
| Final Training Loss | ~0.0673 |
| Best Step | 700 |
| Total Steps | 790 |
| Epochs | 10 |
## Evaluation History (Key Steps)
| Step | WER | CER | Eval Loss |
|---|---|---|---|
| 100 | 93.33 | 47.17 | 0.7525 |
| 200 | 83.26 | 37.40 | 0.4301 |
| 300 | 80.05 | 34.88 | 0.3891 |
| 400 | 78.75 | 34.78 | 0.4148 |
| 500 | 78.13 | 33.95 | 0.4685 |
| 600 | 78.69 | 34.83 | 0.5106 |
| 700 | 77.70 | 34.88 | 0.5518 |
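The best checkpoint can be recovered programmatically by scanning the evaluation history for the lowest WER. A small sketch over the values in the table above:

```python
# Evaluation history from the table above: (step, WER, CER, eval loss)
history = [
    (100, 93.33, 47.17, 0.7525),
    (200, 83.26, 37.40, 0.4301),
    (300, 80.05, 34.88, 0.3891),
    (400, 78.75, 34.78, 0.4148),
    (500, 78.13, 33.95, 0.4685),
    (600, 78.69, 34.83, 0.5106),
    (700, 77.70, 34.88, 0.5518),
]

# Select the checkpoint with the lowest WER, which is the criterion
# that picks checkpoint-700 here.
best_step, best_wer, _, _ = min(history, key=lambda row: row[1])
```

Note that eval loss bottoms out at step 300 while WER keeps improving through step 700, so selecting by WER and selecting by loss would disagree on this run.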
## Training Hyperparameters
- Learning Rate: 1e-5
- Batch Size: 32
- Optimizer: AdamW (fused)
- LR Scheduler: Linear
- Warmup Steps: 100
- Epochs: 10
- Mixed Precision: AMP
- Gradient Accumulation: 2
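Assuming the standard Hugging Face `Seq2SeqTrainer` setup commonly used for Whisper fine-tuning (the card does not confirm this), the hyperparameters above would map to something like the following; `output_dir` and the eval/save cadence are assumptions, not from the card:

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the training configuration listed above.
training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-0db",  # assumed name
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    gradient_accumulation_steps=2,     # effective batch size 64
    num_train_epochs=10,
    lr_scheduler_type="linear",
    warmup_steps=100,
    fp16=True,                         # mixed precision (AMP)
    optim="adamw_torch_fused",         # fused AdamW
    eval_strategy="steps",
    eval_steps=100,                    # matches the 100-step evaluation history
    save_steps=100,
)
```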
## Intended Use
- Speech-to-text transcription
- Noisy-audio (0dB SNR) ASR experiments (high noise)
- Research / academic use
## Limitations
- Performance degrades on very noisy or unseen domains
- Not fully optimized for production use
- Dataset diversity affects generalization
## Notes
- Best model selected from checkpoint-700
- Training continued to 790 steps, but the best performance was reached earlier
- Metrics may vary depending on the dataset split
## Evaluation Results
- WER (self-reported): 77.702
- CER (self-reported): 34.876