How to use ksoky/whisper-small-km with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="ksoky/whisper-small-km")

# Load model directly
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq
processor = AutoProcessor.from_pretrained("ksoky/whisper-small-km")
model = AutoModelForSpeechSeq2Seq.from_pretrained("ksoky/whisper-small-km")

This model is a fine-tuned version of openai/whisper-small on the SLR42 dataset. It achieves the following results on the evaluation set:
Both the encoder and the decoder of the transformer-based model were fine-tuned.
Because the training data is limited, the model's performance is limited to read speech in a narrow domain (tourism).
The training and evaluation sets were split in a 9:1 ratio from the Google text-to-speech corpus.
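As an illustration of a 9:1 split, here is a minimal sketch using only the Python standard library. The card does not say how the split was actually produced, so the shuffling, the seed, the helper name `split_train_eval`, and the utterance IDs are all assumptions made for this example.

```python
import random


def split_train_eval(items, eval_fraction=0.1, seed=0):
    """Shuffle a copy of `items` and split it into (train, eval) lists.

    Hypothetical helper: the seed and the shuffle are assumptions,
    not the procedure used for this model.
    """
    if not 0.0 < eval_fraction < 1.0:
        raise ValueError("eval_fraction must be strictly between 0 and 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # deterministic for a given seed
    n_eval = round(len(shuffled) * eval_fraction)
    return shuffled[n_eval:], shuffled[:n_eval]


# Placeholder utterance IDs standing in for the real corpus entries.
utterance_ids = [f"utt_{i:04d}" for i in range(1000)]
train_ids, eval_ids = split_train_eval(utterance_ids)
print(len(train_ids), len(eval_ids))  # 900 100
```

Fixing the seed keeps the split reproducible, so later evaluation runs score the same held-out utterances.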
The following hyperparameters were used during training:
| Training Loss | Epoch | Step | Validation Loss | WER (%) |
|---|---|---|---|---|
| 0.3639 | 0.76 | 1000 | 0.3452 | 71.9392 |
| 0.1553 | 1.53 | 2000 | 0.2025 | 49.0494 |
| 0.0565 | 2.29 | 3000 | 0.1664 | 39.9240 |
| 0.0334 | 3.06 | 4000 | 0.1471 | 35.6654 |
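
The last column of the table is a word error rate, reported as a percentage. The sketch below shows one common way to compute it: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The card does not say which implementation or tokenization produced the numbers above, so the function name `word_error_rate` and the whitespace tokenization are assumptions. Khmer is usually written without spaces between words, so real scores depend heavily on how the text is segmented.

```python
def word_error_rate(reference, hypothesis):
    """Return the WER in percent between two whitespace-tokenized strings.

    Illustrative sketch only; not necessarily the metric code used
    to produce the table in this card.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substituted word out of four reference words.
print(word_error_rate("the cat sat down", "the cat sit down"))  # 25.0
```

Because insertions count as errors, this measure can exceed 100 when the hypothesis is much longer than the reference.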