---
license: mit
base_model:
- ishmamzarif/bangla_asr_augmented_bangla-whisper-epoch-11
pipeline_tag: automatic-speech-recognition
---
## Description
This model is a fine-tuned version of ishmamzarif/bangla_asr_augmented_bangla-whisper-epoch-11.
That checkpoint is itself a fine-tuned version of bangla-speech-processing/BanglaASR, trained on Bangla speech data.
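
A minimal inference sketch, assuming the checkpoint loads through the standard `transformers` automatic-speech-recognition pipeline (consistent with the `pipeline_tag` above, but not verified against this repository). `MODEL_ID` is a hypothetical placeholder for this repository's Hub id, and `transcribe` is an illustrative helper name, not part of any library.

```python
# Hypothetical placeholder: replace with the actual Hub id of this model.
MODEL_ID = "<namespace>/<this-model-repo>"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file and return the predicted text."""
    # Imported lazily so the module can be imported without transformers installed.
    from transformers import pipeline

    # Assumption: the checkpoint is compatible with the generic ASR pipeline.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]
```

Calling `transcribe("sample.wav")` downloads the weights on first use; decoding a file path typically also requires `ffmpeg` to be available on the system.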
## Environment:
- Python version: 3.12.12
- PyTorch version: 2.8.0+cu126
- Librosa version: 0.10.1
- NumPy version: 1.26.4
## Training Parameters:
- BATCH_SIZE = 4
- GRADIENT_ACCUMULATION_STEPS = 4 (effective batch size = 4 × 4 = 16)
- LEARNING_RATE = 2e-5
- WARMUP_STEPS = 400
- NUM_TRAIN_EPOCHS = 8
- LOGGING_STEPS = 50
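
The parameters above can be collected in a small config object to make the effective batch size explicit. This is an illustrative sketch, not the original training script: the `TrainingConfig` class and the `approx_optimizer_steps` helper are hypothetical names, single-device training is assumed, and the dataset size in the usage note is made up for the arithmetic.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters as listed in this README."""
    batch_size: int = 4
    gradient_accumulation_steps: int = 4
    learning_rate: float = 2e-5
    warmup_steps: int = 400
    num_train_epochs: int = 8
    logging_steps: int = 50

    @property
    def effective_batch_size(self) -> int:
        # Samples contributing to each optimizer update (single device assumed).
        return self.batch_size * self.gradient_accumulation_steps

    def approx_optimizer_steps(self, num_train_examples: int) -> int:
        # Rough estimate; real trainers may round partial batches differently.
        steps_per_epoch = math.ceil(num_train_examples / self.effective_batch_size)
        return steps_per_epoch * self.num_train_epochs


config = TrainingConfig()
```

For a hypothetical training set of 10,000 utterances, `config.approx_optimizer_steps(10_000)` gives 625 updates per epoch over 8 epochs, so the 400 warmup steps would cover a little under the first epoch in that scenario.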
## Validation Set Evaluation:
| **Epoch** | **Training Loss** | **Validation Loss** | **WER (%)** | **Normalized Levenshtein Similarity (%)** |
| --------- | ----------------- | ------------------- | ----------- | ----------------------------------------- |
| 0 | 2.3479 | 1.59398 | 26.519 | 83.03 |
| 2 | 1.5380 | 1.50034 | 18.011 | 87.15 |
| 4 | 1.4665 | 1.47125 | 12.486 | 91.06 |
| 6 | 1.4448 | 1.46236 | 10.607 | 91.97 |
| 7 | 1.4419 | 1.46210 | 10.441 | 92.12 |
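
The README does not state how the two metrics were computed. The sketch below shows one common definition of each, as an assumption: WER as word-level edit distance divided by the number of reference words, and normalized Levenshtein similarity as one minus the character-level edit distance divided by the longer string's length. The function names are illustrative, and no text normalization is applied.

```python
def edit_distance(a, b) -> int:
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent (assumed definition)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def normalized_levenshtein_similarity(reference: str, hypothesis: str) -> float:
    """Character-level similarity in percent (assumed definition)."""
    longest = max(len(reference), len(hypothesis))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - edit_distance(reference, hypothesis) / longest)
```

Under these definitions, one substituted word in a four-word reference yields a WER of 25%. Corpus-level scores depend on whether per-utterance values are averaged or edit counts are pooled, which the table above does not specify.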