Description

This model is a fine-tuned version of ishmamzarif/bangla_asr_augmented_bangla-whisper-epoch-11, which is itself a fine-tuned version of bangla-speech-processing/BanglaASR trained on Bangla speech data.

Environment:

  • Python version: 3.12.12
  • PyTorch version: 2.8.0+cu126
  • Librosa version: 0.10.1
  • NumPy version: 1.26.4
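
To reproduce results it can help to confirm that a local setup matches the versions listed above. The following is a minimal sketch using only the standard library; the pip distribution names (`torch`, `librosa`, `numpy`) and the helper `check_versions` are assumptions made for illustration and are not part of the original model card.

```python
import platform
from importlib import metadata

# Versions copied from the Environment list above.
EXPECTED_PYTHON = "3.12.12"
EXPECTED_PACKAGES = {
    "torch": "2.8.0+cu126",   # assumed pip distribution name for PyTorch
    "librosa": "0.10.1",
    "numpy": "1.26.4",
}


def check_versions(expected):
    """Return {package: (wanted, installed, matches)} for each package.

    `installed` is None when the package is not installed.
    """
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    print("python", EXPECTED_PYTHON, platform.python_version())
    for name, (wanted, installed, ok) in check_versions(EXPECTED_PACKAGES).items():
        print(name, wanted, installed, "OK" if ok else "MISMATCH")
```

A mismatch does not necessarily mean the model will fail to load; it only flags a difference from the environment reported here.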

Training Parameters:

  • BATCH_SIZE = 4
  • GRADIENT_ACCUMULATION_STEPS = 4 # Effective batch size = 16
  • LEARNING_RATE = 2e-5
  • WARMUP_STEPS = 400
  • NUM_TRAIN_EPOCHS = 8
  • LOGGING_STEPS = 50
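
The arithmetic behind the effective batch size, and the way a warmup period typically shapes the learning rate, can be sketched as follows. The constants are copied from the list above; the single-device setting and the linear warmup shape are assumptions, since the card states neither the number of devices nor the scheduler type. The helper names are hypothetical.

```python
# Values copied from the training parameters listed above.
BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4
LEARNING_RATE = 2e-5
WARMUP_STEPS = 400


def effective_batch_size(per_device_batch, accumulation_steps, num_devices=1):
    """Samples contributing to one optimizer update.

    num_devices=1 is an assumption; the card does not state the device count.
    """
    return per_device_batch * accumulation_steps * num_devices


def warmup_lr(step, peak_lr=LEARNING_RATE, warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear warmup.

    Linear warmup is an assumed schedule shape. Any decay after warmup
    is not modeled here: the peak rate is simply held constant.
    """
    if step >= warmup_steps:
        return peak_lr
    return peak_lr * step / warmup_steps


if __name__ == "__main__":
    print(effective_batch_size(BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    for step in (0, 100, 200, 400):
        print(step, warmup_lr(step))
```

With these values, gradients from 4 micro-batches of 4 samples are accumulated before each update, which matches the effective batch size of 16 noted above.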

Validation Set Evaluation:

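| Epoch | Training Loss | Validation Loss | WER (%) | Normalized Levenshtein Similarity (%) |
|-------|---------------|-----------------|---------|---------------------------------------|
| 0     | 2.3479        | 1.59398         | 26.519  | 83.03                                 |
| 2     | 1.5380        | 1.50034         | 18.011  | 87.15                                 |
| 4     | 1.4665        | 1.47125         | 12.486  | 91.06                                 |
| 6     | 1.4448        | 1.46236         | 10.607  | 91.97                                 |
| 7     | 1.4419        | 1.46210         | 10.441  | 92.12                                 |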
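
The two metrics reported above can be illustrated with a small, dependency-free sketch. It assumes the common definitions: WER as word-level edit distance divided by the number of reference words, and normalized Levenshtein similarity as one minus the character-level edit distance divided by the longer string's length. The card does not state the exact definitions or any text normalization used, so treat this as an approximation rather than the evaluation code behind the reported numbers.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_percent(reference, hypothesis):
    """Word error rate in percent, splitting on whitespace."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def normalized_levenshtein_similarity_percent(reference, hypothesis):
    """Character-level similarity in percent.

    Operates on Unicode code points, not grapheme clusters, which matters
    for Bangla text with combining vowel signs.
    """
    longest = max(len(reference), len(hypothesis))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - edit_distance(reference, hypothesis) / longest)


if __name__ == "__main__":
    print(wer_percent("a b c d", "a x c d"))
    print(normalized_levenshtein_similarity_percent("kitten", "sitting"))
```

Lower WER and higher similarity are better, which is consistent with the trend across epochs in the table.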

Model Details:

  • Format: Safetensors
  • Model size: 0.2B params
  • Tensor type: F32

Model tree for Rohan432/Augmented_on_normal