---
license: mit
base_model:
- ishmamzarif/bangla_asr_augmented_bangla-whisper-epoch-11
pipeline_tag: automatic-speech-recognition
---
## Description
This model is a fine-tuned version of ishmamzarif/bangla_asr_augmented_bangla-whisper-epoch-11.
That base model is in turn a fine-tuned version of bangla-speech-processing/BanglaASR, trained on Bangla speech data.
## Environment:
- Python version: 3.12.12
- PyTorch version: 2.8.0+cu126
- Librosa version: 0.10.1
- NumPy version: 1.26.4
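
To compare a local setup against the versions listed above, a small check along these lines may help. It is only a sketch: the PyPI distribution names (`torch`, `librosa`, `numpy`) and the helper names `installed_version` and `environment_mismatches` are assumptions introduced for illustration, not part of this model's tooling.

```python
import platform
from importlib import metadata

# Versions copied from the Environment list above.
EXPECTED_PYTHON = "3.12.12"
EXPECTED_PACKAGES = {
    "torch": "2.8.0+cu126",
    "librosa": "0.10.1",
    "numpy": "1.26.4",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_mismatches(lookup=installed_version, python_version=None):
    """Map each mismatching component to an (expected, found) pair."""
    python_version = python_version or platform.python_version()
    mismatches = {}
    if python_version != EXPECTED_PYTHON:
        mismatches["python"] = (EXPECTED_PYTHON, python_version)
    for package, expected in EXPECTED_PACKAGES.items():
        found = lookup(package)
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in environment_mismatches().items():
        print(f"{name}: expected {expected}, found {found}")
```

An exact string match is deliberately strict. A different patch release or CUDA build will be reported as a mismatch even though it may work fine.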
## Training Parameters:
- BATCH_SIZE = 4
- GRADIENT_ACCUMULATION_STEPS = 4 # Effective batch size = 16
- LEARNING_RATE = 2e-5
- WARMUP_STEPS = 400
- NUM_TRAIN_EPOCHS = 8
- LOGGING_STEPS = 50
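
The effective batch size noted above is the per-step batch size multiplied by the gradient accumulation steps. The sketch below collects the listed values in a plain dataclass and derives that number. The `TrainingConfig` class and its `optimizer_steps_per_epoch` helper are hypothetical names used for illustration, and the calculation assumes a single training device, which this card does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters copied from the Training Parameters list."""

    batch_size: int = 4
    gradient_accumulation_steps: int = 4
    learning_rate: float = 2e-5
    warmup_steps: int = 400
    num_train_epochs: int = 8
    logging_steps: int = 50

    @property
    def effective_batch_size(self) -> int:
        # Gradients from several small batches are summed before each
        # optimizer update, so one update sees this many examples
        # (assuming a single device).
        return self.batch_size * self.gradient_accumulation_steps

    def optimizer_steps_per_epoch(self, num_examples: int) -> int:
        # Rough estimate using ceiling division; a real trainer may round
        # the final partial batch differently.
        return -(-num_examples // self.effective_batch_size)


config = TrainingConfig()
print(config.effective_batch_size)  # 4 * 4 = 16
```

With a hypothetical dataset of 1,000 examples, `config.optimizer_steps_per_epoch(1000)` gives 63 updates per epoch, which helps put the 400 warmup steps in context.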
## Validation Set Evaluation:
| **Epoch** | **Training Loss** | **Validation Loss** | **WER (%)** | **Normalized Levenshtein Similarity (%)** |
| --------- | ----------------- | ------------------- | ----------- | ----------------------------------------- |
| 0 | 2.3479 | 1.59398 | 26.519 | 83.03 |
| 2 | 1.5380 | 1.50034 | 18.011 | 87.15 |
| 4 | 1.4665 | 1.47125 | 12.486 | 91.06 |
| 6 | 1.4448 | 1.46236 | 10.607 | 91.97 |
| 7 | 1.4419 | 1.46210 | 10.441 | 92.12 |
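
The two accuracy columns can be illustrated with a short pure-Python sketch. It assumes WER is the word-level edit distance divided by the number of reference words, and that Normalized Levenshtein Similarity is one minus the character-level edit distance divided by the length of the longer string. This card does not give the exact definitions or text normalization used for the reported numbers, so treat these formulas as one common convention and not as the evaluation script.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of words)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,                       # deletion
                    current[j - 1] + 1,                    # insertion
                    previous[j - 1] + (item_a != item_b),  # substitution
                )
            )
        previous = current
    return previous[-1]


def wer(reference, hypothesis):
    """Word error rate in percent (assumed definition, see lead-in)."""
    reference_words = reference.split()
    if not reference_words:
        raise ValueError("reference must contain at least one word")
    errors = levenshtein(reference_words, hypothesis.split())
    return 100 * errors / len(reference_words)


def normalized_levenshtein_similarity(reference, hypothesis):
    """Character-level similarity in percent (assumed definition)."""
    longest = max(len(reference), len(hypothesis))
    if longest == 0:
        return 100.0
    return 100 * (1 - levenshtein(reference, hypothesis) / longest)


# One substituted word out of four reference words.
print(wer("a b c d", "a x c d"))
# One substituted character out of four.
print(normalized_levenshtein_similarity("abcd", "abxd"))
```

Lower WER and higher similarity are better, which matches the trend in the table as training progresses.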