Hugging Face
Bram Vanroy
PRO
BramVanroy
244 followers · 167 following
https://bramvanroy.github.io/
BramVanroy
BramVanroy
bramvanroy
bramvanroy.bsky.social
AI & ML interests
Artificial intelligence, natural language processing, computational linguistics
Recent Activity
reacted to yuriyvnv's post with 🚀 5 days ago
🎯 WAVe-1B-Multimodal-NL: Word-Level Speech Quality Assessment for Dutch

Following the release of the Portuguese model, we're releasing the Dutch variant of WAVe, a 1B multimodal embedding model that assesses synthetic speech quality at the word level, thereby improving the quality of synthetically augmented datasets for training ASR models. Trained on CommonVoice 16.1 Dutch with 5 corruption strategies, this model catches mispronunciations, timing errors, and prosody issues in synthetic data that sentence-level embeddings miss entirely.

Resources
- Dutch model: https://huggingface.co/yuriyvnv/WAVe-1B-Multimodal-NL
- Portuguese model: https://huggingface.co/yuriyvnv/WAVe-1B-Multimodal-PT
- Code: https://github.com/yuriyvnv/WAVe

This model builds on CommonVoice Dutch data. Thanks to @mozilla and the CommonVoice community for making multilingual speech data accessible.

Would be great to hear from the Dutch NLP community (@BramVanroy, @GroNLP), especially if you're working on Dutch ASR or TTS pipelines where quality filtering could help. Also tagging @hf-audio as this sits at the intersection of speech processing and data curation.
liked a model 24 days ago: hexgrad/Kokoro-82M
liked a Space 2 months ago: antalvdb/olifant-explainability-demo
BramVanroy's datasets (45), sorted by recently updated
BramVanroy/stack_md_lid • Updated Aug 22, 2024 • 21M rows • 65 downloads • 4 likes
BramVanroy/Openhermes-2.5-dutch-46k-format • Updated Aug 21, 2024 • 43.7k rows • 9 downloads
BramVanroy/fietje-2-data • Updated Jun 4, 2024 • 13.8M rows • 11 downloads • 1 like
BramVanroy/occiglot-fineweb-v0.5-nl • Updated Jun 3, 2024 • 16.1M rows • 17 downloads • 1 like
BramVanroy/wiki_simplifications_dutch_dedup_split • Updated Apr 10, 2024 • 2.77M rows • 9 downloads • 1 like
BramVanroy/ultra_feedback_dutch_cleaned_multi • Updated Mar 27, 2024 • 59.9k rows • 7 downloads
BramVanroy/HPLT-Dutch-cleaned-v1.2 • Updated Mar 7, 2024 • 31.7M rows • 13 downloads • 1 like
BramVanroy/dolly-15k-dutch • Updated Jan 22, 2024 • 14.3k rows • 147 downloads • 3 likes
BramVanroy/alpaca-cleaned-dutch • Updated Jan 22, 2024 • 51.3k rows • 124 downloads • 10 likes
BramVanroy/stackoverflow-chat-dutch • Updated Jan 22, 2024 • 57k rows • 67 downloads • 2 likes
BramVanroy/quora-chat-dutch • Updated Jan 15, 2024 • 48.8k rows • 64 downloads • 2 likes
BramVanroy/dutch_chat_datasets • Updated Jan 10, 2024 • 178k rows • 107 downloads • 8 likes
BramVanroy/test-dataset-dont-delete • Updated Jan 5, 2024 • 4 rows • 6 downloads
BramVanroy/xlwic_wn • Updated Oct 2, 2023 • 41.5k rows • 28 downloads • 1 like
BramVanroy/chatgpt-dutch-simplification • Updated Jun 19, 2023 • 1.27k rows • 85 downloads • 5 likes