How to use AhBotNLP/wav2vec2-xls-r-164m-id with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("automatic-speech-recognition", model="AhBotNLP/wav2vec2-xls-r-164m-id")
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForCTC

processor = AutoProcessor.from_pretrained("AhBotNLP/wav2vec2-xls-r-164m-id")
model = AutoModelForCTC.from_pretrained("AhBotNLP/wav2vec2-xls-r-164m-id")
```

This model is a fine-tuned version of evanarlian/distil-wav2vec2-xls-r-164m-id on the evanarlian/common_voice_11_0_id_filtered dataset. Validation loss and word error rate (WER) on the evaluation set are reported per checkpoint in the table below.
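
A model loaded with `AutoModelForCTC` emits one token prediction per audio frame, and those predictions still have to be collapsed into text. The sketch below illustrates greedy CTC decoding in plain Python. It assumes the usual Wav2Vec2 convention that the padding token doubles as the CTC blank and that `|` marks word boundaries. The toy vocabulary and frame sequence are hypothetical and are not taken from this model's tokenizer.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Turn per-frame token ids into text using the greedy CTC rule."""
    # Step 1: merge consecutive duplicate ids, e.g. [2, 2, 3] -> [2, 3].
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # Step 2: drop the blank token, which only separates repeated symbols.
    kept = [token_id for token_id in collapsed if token_id != blank_id]
    # Step 3: map ids to characters and turn word delimiters into spaces.
    text = "".join(id_to_token[token_id] for token_id in kept)
    return " ".join(text.replace(word_delimiter, " ").split())


# Hypothetical toy vocabulary (assumption: id 0 is the blank/pad token).
toy_vocab = {0: "<pad>", 1: "|", 2: "a", 3: "p", 4: "k", 5: "b", 6: "r"}

# Hypothetical argmax output for a short utterance, one id per frame.
toy_frames = [2, 2, 3, 0, 2, 2, 1, 4, 2, 5, 5, 0, 2, 6]

decoded = ctc_greedy_decode(toy_frames, toy_vocab)
print(decoded)  # apa kabar
```

In practice the processor's own decoding method would normally be used instead of a hand-written function; the sketch only shows why repeated frames do not produce repeated letters unless a blank sits between them.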
Training results, logged every 5,000 steps:

| Training Loss | Epoch | Step | Validation Loss | WER |
|---|---|---|---|---|
| 1.4047 | 4.59 | 5000 | 1.0167 | 0.9138 |
| 0.587 | 9.18 | 10000 | 0.4639 | 0.5615 |
| 0.3782 | 13.77 | 15000 | 0.3375 | 0.4496 |
| 0.2867 | 18.37 | 20000 | 0.2881 | 0.4022 |
| 0.2519 | 22.96 | 25000 | 0.2775 | 0.3700 |
| 0.1941 | 27.55 | 30000 | 0.2701 | 0.3516 |
| 0.1727 | 32.14 | 35000 | 0.2795 | 0.3486 |
| 0.1448 | 36.73 | 40000 | 0.2878 | 0.3364 |
| 0.1251 | 41.32 | 45000 | 0.2649 | 0.3275 |
| 0.113 | 45.91 | 50000 | 0.2862 | 0.3168 |
| 0.0994 | 50.51 | 55000 | 0.2798 | 0.3091 |
| 0.0938 | 55.1 | 60000 | 0.2864 | 0.3070 |
| 0.0853 | 59.69 | 65000 | 0.2860 | 0.3069 |
| 0.0724 | 64.28 | 70000 | 0.2994 | 0.3003 |
| 0.0723 | 68.87 | 75000 | 0.2951 | 0.2983 |
| 0.0666 | 73.46 | 80000 | 0.2886 | 0.2941 |
| 0.0659 | 78.05 | 85000 | 0.2865 | 0.2923 |
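
The WER column above is a word error rate: word-level substitutions, deletions, and insertions divided by the number of reference words. As a minimal sketch of that definition, the function below computes WER for one sentence pair with a word-level edit distance. The example sentences are hypothetical, and the card does not state which tool or text normalization produced the reported values, so this should not be read as the evaluation script that was actually used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous_row[j] = edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    previous_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current_row = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current_row.append(
                min(
                    previous_row[j] + 1,  # deletion
                    current_row[j - 1] + 1,  # insertion
                    previous_row[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous_row = current_row

    return previous_row[-1] / len(ref_words)


# Hypothetical example: one of four reference words is missing.
example_wer = word_error_rate("saya pergi ke pasar", "saya pergi pasar")
print(example_wer)  # 0.25
```

On this scale, the final row's WER of 0.2923 corresponds to roughly 29 word errors per 100 reference words. Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.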