Hello everyone! Given the community's increased interest in WAVe for the Portuguese language, the team has retrained the model for 100 epochs to extend learning. The results are much better than those of the previous 30-epoch version.
Key improvements:
| Metric | 30 ep | 100 ep | Change |
|---|---|---|---|
| Loss | 0.49 | 0.22 | -56% |
| Alignment Gap | 0.079 | 0.118 | +49% |
| Corrupt Similarity | 0.31 | 0.23 | -25% |
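For anyone who wants to check the Change column, here is a minimal sketch of the arithmetic. It uses the rounded values shown in the table, so the results can differ by about a point from the column, which was presumably computed from unrounded numbers. The helper name `relative_change` is only illustrative.

```python
def relative_change(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


# Rounded values copied from the table: (30 epochs, 100 epochs).
metrics = {
    "Loss": (0.49, 0.22),
    "Alignment Gap": (0.079, 0.118),
    "Corrupt Similarity": (0.31, 0.23),
}

changes = {name: relative_change(old, new) for name, (old, new) in metrics.items()}

for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
```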
The biggest win is the alignment gap, which grew by about half (0.079 to 0.118): the model is now much better at catching word-level errors such as mispronunciations and timing artifacts. Corrupt pairs are penalized harder (0.23 vs. 0.31), so the filtering threshold becomes more reliable.
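To make the threshold idea concrete, here is a small sketch of similarity-based filtering. It assumes you already have one audio-text similarity score per pair; the clip IDs, scores, threshold value, and the `filter_by_similarity` helper are all hypothetical and not part of the WAVe API. See the model card for how to obtain real scores.

```python
from typing import List, Tuple


def filter_by_similarity(
    scored_pairs: List[Tuple[str, float]], threshold: float
) -> List[str]:
    """Keep the IDs of pairs whose similarity is at or above the threshold."""
    return [pair_id for pair_id, score in scored_pairs if score >= threshold]


# Made-up scores for illustration only; real values come from the model.
scored = [
    ("clip_001", 0.41),
    ("clip_002", 0.22),
    ("clip_003", 0.35),
    ("clip_004", 0.19),
]

THRESHOLD = 0.30  # hypothetical cut-off, tune it on your own data

kept = filter_by_similarity(scored, THRESHOLD)
print(kept)
```

The lower the corrupt pairs score relative to clean ones, the less sensitive the kept set is to the exact cut-off you pick.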
Same repo, same API, drop-in replacement:
from transformers import AutoModel
model = AutoModel.from_pretrained("yuriyvnv/WAVe-1B-Multimodal-PT", trust_remote_code=True)
The updated model card README includes side-by-side training curves for both versions. Check it out!