---
license: mit
base_model:
- ishmamzarif/bangla_asr_augmented_bangla-whisper-epoch-11
pipeline_tag: automatic-speech-recognition
---

## Description

This model is a fine-tuned version of ishmamzarif/bangla_asr_augmented_bangla-whisper-epoch-11. ishmamzarif/bangla_asr_augmented_bangla-whisper-epoch-11 is itself a fine-tuned version of bangla-speech-processing/BanglaASR on Bangla speech data.

## Environment:

- Python version: 3.12.12
- PyTorch version: 2.8.0+cu126
- Librosa version: 0.10.1
- NumPy version: 1.26.4

## Training Parameters:

- BATCH_SIZE = 4
- GRADIENT_ACCUMULATION_STEPS = 4 (effective batch size = 4 × 4 = 16)
- LEARNING_RATE = 2e-5
- WARMUP_STEPS = 400
- NUM_TRAIN_EPOCHS = 8
- LOGGING_STEPS = 50

## Validation Set Evaluation:

| **Epoch** | **Training Loss** | **Validation Loss** | **WER (%)** | **Normalized Levenshtein Similarity (%)** |
| --------- | ----------------- | ------------------- | ----------- | ----------------------------------------- |
| 0         | 2.3479            | 1.59398             | 26.519      | 83.03                                     |
| 2         | 1.5380            | 1.50034             | 18.011      | 87.15                                     |
| 4         | 1.4665            | 1.47125             | 12.486      | 91.06                                     |
| 6         | 1.4448            | 1.46236             | 10.607      | 91.97                                     |
| 7         | 1.4419            | 1.46210             | 10.441      | 92.12                                     |
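
The two validation metrics reported above can be illustrated with a minimal pure-Python sketch. This card does not state how the metrics were actually computed, so the definitions here are assumptions: WER is taken as the word-level edit distance divided by the number of reference words, and Normalized Levenshtein Similarity as one minus the character-level edit distance divided by the longer string's length. The function names `edit_distance`, `wer_percent`, and `normalized_levenshtein_similarity_percent` are hypothetical, and no text normalization (punctuation or Unicode handling) is applied.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer_percent(reference, hypothesis):
    """Word error rate in percent (assumed definition: word edits / reference words)."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hyp_words) / len(ref_words)


def normalized_levenshtein_similarity_percent(reference, hypothesis):
    """Character-level similarity in percent (assumed normalization: longer string's length)."""
    longest = max(len(reference), len(hypothesis))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1.0 - edit_distance(reference, hypothesis) / longest)


if __name__ == "__main__":
    # Illustrative sentence pair, not taken from the validation set.
    reference = "আমি ভাত খাই"
    hypothesis = "আমি ভাত খাও"
    print(f"WER: {wer_percent(reference, hypothesis):.2f}%")
    print(f"Similarity: {normalized_levenshtein_similarity_percent(reference, hypothesis):.2f}%")
```

Under these assumed definitions, a lower WER and a higher similarity both indicate better transcriptions, which matches the trend in the table as training progresses. Note that the character-level metric counts Unicode code points, so Bangla combining marks are treated as separate characters.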