Dataset: 1rsh/gujarati-openslr
How to use 1rsh/whisper-small-gu with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("automatic-speech-recognition", model="1rsh/whisper-small-gu")
```

Or load the model directly:

```python
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq

processor = AutoProcessor.from_pretrained("1rsh/whisper-small-gu")
model = AutoModelForSpeechSeq2Seq.from_pretrained("1rsh/whisper-small-gu")
```

This model is a fine-tuned version of vasista22/whisper-gujarati-small on the Gujarati OpenSLR dataset. It achieves the following results on the evaluation set:

- Loss: 0.0472
- Wer: 35.3258
- Cer: 22.3685
Model description: more information needed.

Intended uses & limitations: more information needed.

Training and evaluation data: more information needed.
The following results were recorded during training:
| Training Loss | Epoch | Step | Validation Loss | Wer | Cer |
|---|---|---|---|---|---|
| 0.0018 | 4.9505 | 1000 | 0.0472 | 35.3258 | 22.3685 |
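The Wer and Cer columns are word- and character-level edit distances normalized by the reference length, reported as percentages. As a minimal pure-Python sketch of what these metrics mean (function names and example strings are illustrative, not from this card):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences via one-row dynamic programming."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            # substitution uses prev (previous row, previous column);
            # dp[j] still holds the previous row's value when min() runs
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (r != h))
    return dp[len(hyp)]

def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count, in percent."""
    ref, hyp = reference.split(), hypothesis.split()
    return 100.0 * edit_distance(ref, hyp) / len(ref)

def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length, in percent."""
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)
```

In practice these metrics are usually computed with libraries such as evaluate or jiwer; this sketch only shows the definition behind the numbers in the table.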
To transcribe a single audio file with this model, the following snippet can be used:

```python
>>> import torch
>>> from transformers import pipeline

>>> # path to the audio file to be transcribed
>>> audio = "/path/to/audio.format"

>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
>>> transcribe = pipeline(task="automatic-speech-recognition", model="1rsh/whisper-small-gu", chunk_length_s=30, device=device)

>>> # force Gujarati transcription regardless of automatic language detection
>>> transcribe.model.config.forced_decoder_ids = transcribe.tokenizer.get_decoder_prompt_ids(language="gu", task="transcribe")

>>> print("Transcription:", transcribe(audio)["text"])
```
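The pipeline accepts a file path as above, but it can also take raw samples together with a sampling rate. As a stdlib-only sketch (the helper name and its 16-bit mono scope are my own choices, not part of this model card), one way to load PCM WAV samples as floats in [-1, 1]:

```python
import array
import wave

def load_wav_mono(path):
    """Read a 16-bit PCM WAV file and return (samples, sample_rate),
    with samples scaled to floats in [-1.0, 1.0].
    Illustrative helper, not part of the model card."""
    with wave.open(path, "rb") as f:
        assert f.getsampwidth() == 2, "this sketch only handles 16-bit PCM"
        rate = f.getframerate()
        raw = array.array("h", f.readframes(f.getnframes()))
        if f.getnchannels() == 2:
            raw = raw[::2]  # keep the left channel only
        return [s / 32768.0 for s in raw], rate
```

Whisper models expect 16 kHz input, so resample first if the file's rate differs; the samples can then be passed to the pipeline as `transcribe({"raw": numpy.asarray(samples, dtype=numpy.float32), "sampling_rate": rate})`.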