---
license: mit
base_model:
- ishmamzarif/bangla_asr_augmented_bangla-whisper-epoch-11
pipeline_tag: automatic-speech-recognition
---

## Description
This model is a fine-tuned version of `ishmamzarif/bangla_asr_augmented_bangla-whisper-epoch-11`, which is itself a fine-tuned version of `bangla-speech-processing/BanglaASR` trained on Bangla speech data.

## Environment
- Python version: 3.12.12
- PyTorch version: 2.8.0+cu126
- Librosa version: 0.10.1
- NumPy version: 1.26.4

## Training Parameters

- `BATCH_SIZE = 4`
- `GRADIENT_ACCUMULATION_STEPS = 4` (effective batch size = 4 × 4 = 16)
- `LEARNING_RATE = 2e-5`
- `WARMUP_STEPS = 400`
- `NUM_TRAIN_EPOCHS = 8`
- `LOGGING_STEPS = 50`

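
The effective batch size quoted above follows from multiplying the per-step batch size by the number of gradient accumulation steps. The sketch below reproduces that arithmetic and gives a rough estimate of the number of optimizer updates for a given dataset size. It is illustrative only: the single-device setting (`NUM_DEVICES = 1`), the helper `optimizer_steps`, and the mapping onto Hugging Face `TrainingArguments` keyword names are assumptions, since this card does not state how many devices or which training loop were used.

```python
import math

# Values copied from the parameter list above.
BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4
LEARNING_RATE = 2e-5
WARMUP_STEPS = 400
NUM_TRAIN_EPOCHS = 8
LOGGING_STEPS = 50

# Assumption: training ran on a single device (not stated in this card).
NUM_DEVICES = 1

# Samples contributing to each optimizer update.
effective_batch_size = BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def optimizer_steps(num_train_samples: int) -> int:
    """Rough count of optimizer updates over all epochs (hypothetical helper).

    Real training loops may round partial accumulation windows differently.
    """
    micro_batches_per_epoch = math.ceil(num_train_samples / (BATCH_SIZE * NUM_DEVICES))
    updates_per_epoch = math.ceil(micro_batches_per_epoch / GRADIENT_ACCUMULATION_STEPS)
    return updates_per_epoch * NUM_TRAIN_EPOCHS


# Assumption: if the Hugging Face Trainer were used, the settings would map
# onto these `TrainingArguments` keyword names.
training_kwargs = {
    "per_device_train_batch_size": BATCH_SIZE,
    "gradient_accumulation_steps": GRADIENT_ACCUMULATION_STEPS,
    "learning_rate": LEARNING_RATE,
    "warmup_steps": WARMUP_STEPS,
    "num_train_epochs": NUM_TRAIN_EPOCHS,
    "logging_steps": LOGGING_STEPS,
}

print("effective batch size:", effective_batch_size)
```

Comparing `optimizer_steps(n)` with `WARMUP_STEPS` for the actual training-set size `n` shows what fraction of training is spent in warmup; the dataset size itself is not given in this card.
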
## Validation Set Evaluation

| **Epoch** | **Training Loss** | **Validation Loss** | **WER (%)** | **Normalized Levenshtein Similarity (%)** |
| --------- | ----------------- | ------------------- | ----------- | ----------------------------------------- |
| 0 | 2.3479 | 1.59398 | 26.519 | 83.03 |
| 2 | 1.5380 | 1.50034 | 18.011 | 87.15 |
| 4 | 1.4665 | 1.47125 | 12.486 | 91.06 |
| 6 | 1.4448 | 1.46236 | 10.607 | 91.97 |
| 7 | 1.4419 | 1.46210 | 10.441 | 92.12 |
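
For readers unfamiliar with the two accuracy metrics in the table, the following is a minimal pure-Python sketch of how they are commonly defined. It is not the evaluation script used for this model: the exact text normalization, tokenization, and averaging are not described in this card. The sketch assumes WER is the word-level edit distance divided by the number of reference words, and that the normalized Levenshtein similarity is one minus the character-level edit distance divided by the length of the longer string. All function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_percent(reference: str, hypothesis: str) -> float:
    """Word error rate in percent (assumed definition, whitespace tokenization)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def normalized_levenshtein_similarity_percent(reference: str, hypothesis: str) -> float:
    """Character-level similarity in percent (assumed definition)."""
    longest = max(len(reference), len(hypothesis))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1.0 - edit_distance(reference, hypothesis) / longest)


reference = "আমি ভাত খাই"
hypothesis = "আমি ভাত খাব"
print(f"WER: {wer_percent(reference, hypothesis):.2f}%")
print(f"Similarity: {normalized_levenshtein_similarity_percent(reference, hypothesis):.2f}%")
```

Note that Python's `len` counts Unicode code points, so Bangla combining vowel signs are counted as separate characters here; an evaluation that works on grapheme clusters would give slightly different similarity values.
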