How to use rossevine/Model_G_Wav2Vec2_Versi1 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("automatic-speech-recognition", model="rossevine/Model_G_Wav2Vec2_Versi1")

# Load model directly
from transformers import AutoProcessor, AutoModelForCTC

processor = AutoProcessor.from_pretrained("rossevine/Model_G_Wav2Vec2_Versi1")
model = AutoModelForCTC.from_pretrained("rossevine/Model_G_Wav2Vec2_Versi1")

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the common_voice dataset. It achieves the following results on the evaluation set:
More information needed
The following hyperparameters were used during training:
| Training Loss | Epoch | Step | Validation Loss | WER | CER |
|---|---|---|---|---|---|
| 3.7506 | 5.97 | 400 | 0.6970 | 1.0124 | 0.7636 |
| 0.3678 | 11.94 | 800 | 0.4711 | 1.0290 | 0.7660 |
| 0.1612 | 17.91 | 1200 | 0.4492 | 1.0007 | 0.7606 |
| 0.1056 | 23.88 | 1600 | 0.4012 | 1.0040 | 0.7658 |
| 0.0693 | 29.85 | 2000 | 0.4072 | 1.0101 | 0.7622 |
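
The word error rate and character error rate columns above can be read with the help of a small sketch. It assumes the conventional definitions (Levenshtein edit distance divided by the reference length, counted in words for WER and in characters for CER); the card does not say which library or text normalization produced the reported numbers, and the sentences below are made-up examples, not data from the evaluation set. Because insertions count as errors, WER under this definition is not capped at 1.0, which is one way a value slightly above 1.0 can arise.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from the reference prefix to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference item missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra item in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance over the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical example: the hypothesis has the right characters but wrong word breaks.
reference = "good morning"
hypothesis = "go od mor ning"

print(f"WER = {wer(reference, hypothesis):.3f}")  # 4 word errors / 2 reference words
print(f"CER = {cer(reference, hypothesis):.3f}")  # 2 inserted spaces / 12 reference characters
```

In this made-up case the WER is 2.0 while the CER stays near 0.17, which shows how the two metrics can diverge sharply. It is only an illustration of the definitions, not a diagnosis of the results in the table.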