Rohan432/Augmented_on_normal

---
license: mit
base_model:
- ishmamzarif/bangla_asr_augmented_bangla-whisper-epoch-11
pipeline_tag: automatic-speech-recognition
---

## Description
This model is a fine-tuned version of ishmamzarif/bangla_asr_augmented_bangla-whisper-epoch-11.
That base model is in turn a fine-tuned version of bangla-speech-processing/BanglaASR, trained on Bangla speech data.
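
A minimal inference sketch, assuming the repository `Rohan432/Augmented_on_normal` ships the model weights together with the processor files that the Hugging Face `transformers` ASR pipeline expects (the card does not confirm this). The helper name `transcribe` is hypothetical, and decoding an audio file from a path typically also requires `ffmpeg` to be installed.

```python
# Repository id taken from this model card.
MODEL_ID = "Rohan432/Augmented_on_normal"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the transcription of one audio file.

    Hypothetical helper: assumes the repo is loadable through the standard
    transformers automatic-speech-recognition pipeline.
    """
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)  # the pipeline returns a dict with a "text" key
    return result["text"]
```

For example, `transcribe("sample.wav")` would transcribe a local recording, where `sample.wav` is a placeholder path. The pipeline is rebuilt on every call here for brevity; for batch use it would be cheaper to construct it once and reuse it.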

## Environment
- Python version: 3.12.12
- PyTorch version: 2.8.0+cu126
- Librosa version: 0.10.1
- NumPy version: 1.26.4

## Training Parameters

- BATCH_SIZE = 4
- GRADIENT_ACCUMULATION_STEPS = 4 (effective batch size = 4 × 4 = 16)
- LEARNING_RATE = 2e-5
- WARMUP_STEPS = 400
- NUM_TRAIN_EPOCHS = 8
- LOGGING_STEPS = 50
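
The parameters above could be mapped onto Hugging Face `Seq2SeqTrainingArguments` roughly as follows. This is a sketch under assumptions: the card does not say which trainer was used, a single training device is assumed (consistent with the stated effective batch size of 16), and the function names and the output directory are hypothetical.

```python
# Values copied from the training parameters listed in this card.
BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4
LEARNING_RATE = 2e-5
WARMUP_STEPS = 400
NUM_TRAIN_EPOCHS = 8
LOGGING_STEPS = 50


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of samples contributing to each optimizer step."""
    return per_device * accumulation_steps * num_devices


def training_kwargs(output_dir: str) -> dict:
    """Keyword arguments using the Seq2SeqTrainingArguments parameter names."""
    return {
        "output_dir": output_dir,
        "per_device_train_batch_size": BATCH_SIZE,
        "gradient_accumulation_steps": GRADIENT_ACCUMULATION_STEPS,
        "learning_rate": LEARNING_RATE,
        "warmup_steps": WARMUP_STEPS,
        "num_train_epochs": NUM_TRAIN_EPOCHS,
        "logging_steps": LOGGING_STEPS,
    }


def build_training_arguments(output_dir: str = "./whisper-bangla-finetune"):
    """Assumption: training used the transformers Seq2SeqTrainer API."""
    # Imported lazily so the constants above can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**training_kwargs(output_dir))
```

With one device, `effective_batch_size(BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS)` gives 4 × 4 = 16, matching the figure stated in the list.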

## Validation Set Evaluation

| Epoch | Training Loss | Validation Loss | WER (%) | Normalized Levenshtein Similarity (%) |
| --- | --- | --- | --- | --- |
| 0 | 2.3479 | 1.59398 | 26.519 | 83.03 |
| 2 | 1.5380 | 1.50034 | 18.011 | 87.15 |
| 4 | 1.4665 | 1.47125 | 12.486 | 91.06 |
| 6 | 1.4448 | 1.46236 | 10.607 | 91.97 |
| 7 | 1.4419 | 1.46210 | 10.441 | 92.12 |
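
For readers unfamiliar with the two metrics in the table, the sketch below shows one common way to compute them for a single utterance. The card does not state the exact definitions or any text normalization used, so this assumes WER is the word-level edit distance divided by the number of reference words, and that the similarity is one minus the character-level edit distance divided by the longer string's length. All function names are illustrative.

```python
def levenshtein(a, b) -> int:
    """Edit distance between two sequences (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (x != y),  # substitution, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, relative to the reference word count."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * levenshtein(ref_words, hyp_words) / len(ref_words)


def normalized_levenshtein_similarity(reference: str, hypothesis: str) -> float:
    """Character-level similarity in percent, normalized by the longer string."""
    longest = max(len(reference), len(hypothesis))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1.0 - levenshtein(reference, hypothesis) / longest)
```

Corpus-level figures like those in the table are usually obtained by aggregating over all validation utterances. Whether the reported numbers are per-utterance averages or pooled over the whole set is not specified here.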