Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up
yuriyvnvΒ 
posted an update 19 days ago
Post
342
🎯 WAVe-1B-Multimodal-NL: Word-Level Speech Quality Assessment for Dutch

Following the release of the Portuguese model, we're releasing the Dutch variant of WAVe β€” a 1B multimodal embedding model that assesses synthetic speech quality at the word level, thereby improving the quality of synthetically augmented datasets for training ASR models.

Trained on CommonVoice 16.1 Dutch with 5 corruption strategies, this model catches mispronunciations, timing errors, and prosody issues in synthetic data that sentence-level embeddings miss entirely.
Resources

- Dutch model: yuriyvnv/WAVe-1B-Multimodal-NL
- Portuguese model: yuriyvnv/WAVe-1B-Multimodal-PT
- Code: https://github.com/yuriyvnv/WAVe

This model builds on CommonVoice Dutch data β€” thanks to @mozilla and the CommonVoice community for making multilingual speech data accessible.

Would be great to hear from the Dutch NLP community β€” @BramVanroy @GroNLP β€” especially if you're working on Dutch ASR or TTS pipelines where quality filtering could help. Also tagging @hf-audio as this sits at the intersection of speech processing and data curation.
In this post