How to use jiobiala24/wav2vec2-base-cv with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="jiobiala24/wav2vec2-base-cv")

# Load model directly
from transformers import AutoProcessor, AutoModelForCTC
processor = AutoProcessor.from_pretrained("jiobiala24/wav2vec2-base-cv")
model = AutoModelForCTC.from_pretrained("jiobiala24/wav2vec2-base-cv")

This model is a fine-tuned version of facebook/wav2vec2-base on the common_voice dataset. It achieves the following results on the evaluation set:
More information needed
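The direct-loading snippet above stops once the processor and model are in memory. A CTC model emits one token prediction per audio frame, so turning its output into text needs a decoding step. The sketch below shows greedy CTC decoding in plain Python: collapse consecutive repeats, then drop the blank token. The vocabulary, the blank id of 0, and the `|` word delimiter are illustrative assumptions, not values read from this checkpoint. With Transformers, the same step is normally done by taking the argmax over the model's logits and passing the ids to `processor.batch_decode`.

```python
from itertools import groupby


def greedy_ctc_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Greedy CTC decoding: merge repeated frame predictions, then remove blanks."""
    # Step 1: collapse runs of identical ids (one spoken symbol spans many frames).
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # Step 2: drop the blank token, which separates genuine repeated letters.
    tokens = [id_to_token[i] for i in collapsed if i != blank_id]
    # Step 3: join characters and turn the word delimiter into spaces.
    return "".join(tokens).replace(word_delimiter, " ").strip()


# Hypothetical toy vocabulary; a real checkpoint ships its own vocab file.
toy_vocab = {0: "<pad>", 1: "|", 2: "h", 3: "i", 4: "l", 5: "o", 6: "e"}

# Per-frame argmax ids, as they might come out of a CTC head (made-up values).
frames = [2, 2, 0, 6, 4, 4, 0, 4, 5, 5, 1, 1, 2, 3, 0]

print(greedy_ctc_decode(frames, toy_vocab))  # prints "hello hi"
```

The blank between the two runs of `l` is what keeps the double letter in "hello". Without it, the repeats would collapse into a single `l`.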
The hyperparameters used during training are not listed here. The table below shows the training results:
| Training Loss | Epoch | Step | Validation Loss | Wer |
|---|---|---|---|---|
| 4.563 | 3.18 | 500 | 2.9826 | 1.0 |
| 2.0012 | 6.37 | 1000 | 0.9528 | 0.5354 |
| 0.4841 | 9.55 | 1500 | 0.8838 | 0.4325 |
| 0.2748 | 12.74 | 2000 | 0.9437 | 0.4130 |
| 0.1881 | 15.92 | 2500 | 0.9603 | 0.4005 |
| 0.1426 | 19.11 | 3000 | 1.0605 | 0.3955 |
| 0.1134 | 22.29 | 3500 | 1.0733 | 0.3897 |
| 0.0963 | 25.48 | 4000 | 1.1387 | 0.3835 |
| 0.0829 | 28.66 | 4500 | 1.1562 | 0.3804 |
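The Wer column above is the word error rate: the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal, self-contained implementation for illustration. It is not the evaluation script used for this model, and the function name and example sentences are made up.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous_row[j] is the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current_row = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous_row[j - 1] + (ref_word != hyp_word)
            deletion = previous_row[j] + 1
            insertion = current_row[j - 1] + 1
            current_row.append(min(substitution, deletion, insertion))
        previous_row = current_row

    return previous_row[-1] / len(ref_words)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))

# An empty transcription scores exactly 1.0.
print(word_error_rate("the cat sat", ""))  # prints 1.0
```

An empty output is one way to get the 1.0 in the first row, though the table alone does not say what the model produced at step 500. Note also that validation loss is lowest at step 1500 (0.8838) and rises afterwards, while Wer keeps falling to 0.3804 at step 4500, so the two metrics would pick different checkpoints.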