Description
This model is a fine-tuned version of ishmamzarif/bangla_asr_augmented_bangla-whisper-epoch-11, which is itself bangla-speech-processing/BanglaASR fine-tuned on Bangla speech data.
Environment:
- Python version: 3.12.12
- PyTorch version: 2.8.0+cu126
- Librosa version: 0.10.1
- NumPy version: 1.26.4
Training Parameters:
- BATCH_SIZE = 4
- GRADIENT_ACCUMULATION_STEPS = 4 # Effective batch size = 16
- LEARNING_RATE = 2e-5
- WARMUP_STEPS = 400
- NUM_TRAIN_EPOCHS = 8
- LOGGING_STEPS = 50
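As a rough illustration of how these parameters interact, the sketch below recomputes the effective batch size and shows what a warmup schedule could look like. The constants are copied from the list above; the helper names `effective_batch_size` and `warmup_lr` are hypothetical, and the linear warmup shape, the single-device default, and the constant rate after warmup are assumptions, since the card does not state the scheduler or the hardware used.

```python
# Values copied from the training parameters listed above.
BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4
LEARNING_RATE = 2e-5
WARMUP_STEPS = 400


def effective_batch_size(batch_size: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of samples contributing to each optimizer step.

    num_devices=1 is an assumption; the card does not state the hardware.
    """
    return batch_size * accumulation_steps * num_devices


def warmup_lr(step: int, peak_lr: float = LEARNING_RATE,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at a given optimizer step.

    Assumes a linear ramp from 0 to peak_lr over warmup_steps. The schedule
    after warmup is not stated in the card, so the rate is held constant here.
    """
    if step >= warmup_steps:
        return peak_lr
    return peak_lr * step / warmup_steps


if __name__ == "__main__":
    print(effective_batch_size(BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    for step in (0, 100, 200, 400):
        print(step, warmup_lr(step))
```

With these values, each optimizer step accumulates gradients over 4 forward passes of 4 samples, which matches the effective batch size of 16 noted above.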
Validation Set Evaluation:
| Epoch | Training Loss | Validation Loss | WER (%) | Normalized Levenshtein Similarity (%) |
|---|---|---|---|---|
| 0 | 2.3479 | 1.59398 | 26.519 | 83.03 |
| 2 | 1.5380 | 1.50034 | 18.011 | 87.15 |
| 4 | 1.4665 | 1.47125 | 12.486 | 91.06 |
| 6 | 1.4448 | 1.46236 | 10.607 | 91.97 |
| 7 | 1.4419 | 1.46210 | 10.441 | 92.12 |
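The two accuracy columns in the table can be illustrated with a small, dependency-free sketch. The card does not say how the metrics were computed, so the definitions below are assumptions: WER as word-level edit distance divided by the number of reference words, and normalized Levenshtein similarity as 1 minus the character-level edit distance divided by the longer string's length, both expressed as percentages. The function names are hypothetical.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent for one utterance (assumed definition)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def normalized_levenshtein_similarity(reference: str, hypothesis: str) -> float:
    """Character-level similarity in percent (assumed normalization by the
    longer string). Operates on Unicode code points, not grapheme clusters."""
    longest = max(len(reference), len(hypothesis))
    if longest == 0:
        return 100.0
    return 100.0 * (1.0 - edit_distance(reference, hypothesis) / longest)


if __name__ == "__main__":
    print(wer("a b c d", "a b x d"))
    print(normalized_levenshtein_similarity("kitten", "sitting"))
```

Under these definitions, lower WER and higher similarity both indicate better transcriptions, which is the trend the table shows across epochs. Scores reported over a whole validation set are usually aggregated across utterances, and this sketch does not attempt to reproduce that aggregation.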
Model tree for Rohan432/Augmented_on_normal
- Base model: bangla-speech-processing/BanglaASR