Whisper Small Bengali
This is a fine-tuned Whisper Small model for Bengali (Bangla) speech recognition.
Model Details
- Base Model: openai/whisper-small
- Language: Bengali (bn)
- Training Steps: 2000
- Final Training Loss: N/A
Usage
import torch
from transformers import pipeline

# Use the first GPU if one is available, otherwise fall back to CPU
device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Create the speech recognition pipeline; long audio is processed in 30-second chunks
asr = pipeline(
    "automatic-speech-recognition",
    model="vivasoft/whisper-small-bn",
    chunk_length_s=30,
    device=device,
)

# Force Bengali transcription instead of language detection or translation
asr.model.config.forced_decoder_ids = asr.tokenizer.get_decoder_prompt_ids(
    language="bn",
    task="transcribe",
)

# Path to your audio file (any format ffmpeg can decode, e.g. WAV or MP3)
audio_file = "/content/yt-3.mp3"

# Run transcription and print the recognized text
result = asr(audio_file)
print("Transcription:", result["text"])
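The usage snippet pins the language by writing forced_decoder_ids into the model config. Recent transformers releases may warn about, or ignore, that attribute, so the sketch below shows an alternative that passes the language and task through generate_kwargs at call time. It is a sketch under assumptions, not the author's method: the helper names bengali_generate_kwargs and transcribe are hypothetical, and whether generate_kwargs accepts language and task depends on the installed transformers version.

```python
def bengali_generate_kwargs(task: str = "transcribe") -> dict:
    """Decoding options that pin Whisper to Bengali (hypothetical helper)."""
    return {"language": "bengali", "task": task}


def transcribe(audio_path: str, model_id: str = "vivasoft/whisper-small-bn") -> str:
    """Transcribe one audio file; heavy imports happen only when this is called."""
    import torch
    from transformers import pipeline

    # Same device selection and chunking as the usage snippet
    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,
        device=device,
    )

    # Assumption: the installed transformers version forwards these
    # options to Whisper's generate() method
    result = asr(audio_path, generate_kwargs=bengali_generate_kwargs())
    return result["text"]
```

Calling transcribe("/content/yt-3.mp3") should return the same kind of text as the usage snippet. Rebuilding the pipeline on every call is slow, so for batch work it would be better to create it once and reuse it.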
Training Details
- Training Data: openslr37
- Language: Bengali (bn)
- Training Steps: 2000
- Batch Size: 4
- Learning Rate: 1e-05
- Optimizer: AdamW
- Evaluation WER (eval_wer): 0.308 (about 30.8%)
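For readers unfamiliar with the metric, word error rate (WER) is the number of word substitutions, deletions, and insertions needed to turn the model output into the reference transcript, divided by the number of reference words. A value near 0.31 therefore means roughly 31 word errors per 100 reference words. This card does not say which tool or text normalization produced eval_wer, so the function below is only a minimal sketch of the standard definition, assuming whitespace tokenization and no normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# One extra word in the hypothesis against a three-word reference
print(word_error_rate("আমি ভাত খাই", "আমি ভাত খাই না"))
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words.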
Limitations
- Optimized for Bengali speech only
- Works best with clear audio at a 16 kHz sampling rate
- May not perform well on heavily accented or noisy audio
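Since the model works best at 16 kHz, it can help to check a recording before transcribing it. The sketch below uses only Python's standard wave module to report the sample rate, channel count, and duration of an uncompressed PCM WAV file. The function name inspect_wav and the returned keys are illustrative, not part of this model's API. When given a file path, the pipeline usually resamples the audio itself while decoding, so this check matters mainly when you pass raw arrays or want to spot low-quality sources.

```python
import wave

TARGET_RATE = 16_000  # sampling rate Whisper models expect, in Hz


def inspect_wav(path: str) -> dict:
    """Return basic properties of a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.getnframes()

    return {
        "sample_rate": rate,
        "channels": channels,
        "duration_s": frames / rate,
        "needs_resampling": rate != TARGET_RATE,
    }
```

A recording that reports a sample rate well below 16 kHz (for example 8 kHz telephone audio) can still be resampled upward, but the missing high frequencies are not recovered, so accuracy may be lower than the reported WER suggests.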
Acknowledgments
Based on OpenAI's Whisper model: https://github.com/openai/whisper