How to use iFaz/whisper-SER-base-v7 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="iFaz/whisper-SER-base-v7")
# Load model directly
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq
processor = AutoProcessor.from_pretrained("iFaz/whisper-SER-base-v7")
model = AutoModelForSpeechSeq2Seq.from_pretrained("iFaz/whisper-SER-base-v7")

This model is a fine-tuned version of openai/whisper-base on the Whisper_Compatible_SER_benchmark + enhanced_facebook_voxpopulik_16k_Whisper_Compatible dataset. It achieves the following results on the evaluation set:
More information needed
The following hyperparameters were used during training:
| Training Loss | Epoch | Step | Validation Loss | Wer |
|---|---|---|---|---|
| 0.3141 | 0.5510 | 1000 | 0.3218 | 42.8881 |
| 0.1626 | 1.1019 | 2000 | 0.2021 | 58.5652 |
| 0.1553 | 1.6529 | 3000 | 0.1462 | 87.1676 |
| 0.1091 | 2.2039 | 4000 | 0.1199 | 63.8528 |
| 0.1069 | 2.7548 | 5000 | 0.1027 | 63.3271 |
| 0.042 | 3.3058 | 6000 | 0.0958 | 66.8831 |
| 0.0434 | 3.8567 | 7000 | 0.0935 | 77.2418 |
| 0.0254 | 4.4077 | 8000 | 0.0926 | 64.4712 |
| 0.0265 | 4.9587 | 9000 | 0.0939 | 59.9876 |
| 0.0136 | 5.5096 | 10000 | 0.0955 | 58.2870 |
| 0.009 | 6.0606 | 11000 | 0.0985 | 62.9561 |
| 0.0067 | 6.6116 | 12000 | 0.0978 | 56.9573 |
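
The table can be easier to read with a few derived numbers, so here is a minimal sketch that summarizes it. The rows are copied verbatim from the table above. The `summarize` helper and the dictionary keys are hypothetical names introduced for illustration only. The sketch reports which logged step has the lowest validation loss, which has the lowest WER, and roughly how many optimizer steps make up one epoch (estimated from the last row). It assumes the Wer column is on a percentage scale, which the card does not state explicitly.

```python
# Rows copied from the training-results table above:
# (training_loss, epoch, step, validation_loss, wer)
ROWS = [
    (0.3141, 0.5510, 1000, 0.3218, 42.8881),
    (0.1626, 1.1019, 2000, 0.2021, 58.5652),
    (0.1553, 1.6529, 3000, 0.1462, 87.1676),
    (0.1091, 2.2039, 4000, 0.1199, 63.8528),
    (0.1069, 2.7548, 5000, 0.1027, 63.3271),
    (0.042, 3.3058, 6000, 0.0958, 66.8831),
    (0.0434, 3.8567, 7000, 0.0935, 77.2418),
    (0.0254, 4.4077, 8000, 0.0926, 64.4712),
    (0.0265, 4.9587, 9000, 0.0939, 59.9876),
    (0.0136, 5.5096, 10000, 0.0955, 58.2870),
    (0.009, 6.0606, 11000, 0.0985, 62.9561),
    (0.0067, 6.6116, 12000, 0.0978, 56.9573),
]


def summarize(rows):
    """Hypothetical helper: derive a few summary figures from the logged rows."""
    best_loss = min(rows, key=lambda r: r[3])  # row with lowest validation loss
    best_wer = min(rows, key=lambda r: r[4])   # row with lowest WER
    last = rows[-1]
    return {
        "lowest_val_loss_step": best_loss[2],
        "lowest_val_loss": best_loss[3],
        "lowest_wer_step": best_wer[2],
        "lowest_wer": best_wer[4],
        # step / epoch from the final row approximates steps per epoch
        "steps_per_epoch": last[2] / last[1],
    }


summary = summarize(ROWS)
for key, value in summary.items():
    print(f"{key}: {value}")
```

In these logs the two metrics disagree: validation loss is lowest at step 8000, while WER is lowest at the first logged step (1000). The card does not say which checkpoint was published, so the choice of metric matters when comparing checkpoints.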
Base model
openai/whisper-base