Hugging Face
90.5 TFLOPS
Bram Vanroy
PRO
BramVanroy
243 followers · 167 following
https://bramvanroy.github.io/
BramVanroy
BramVanroy
bramvanroy
bramvanroy.bsky.social
AI & ML interests
Artificial intelligence, natural language processing, computational linguistics
Recent Activity
reacted to yuriyvnv's post with 🚀 · 2 days ago
🎯 WAVe-1B-Multimodal-NL: Word-Level Speech Quality Assessment for Dutch

Following the release of the Portuguese model, we're releasing the Dutch variant of WAVe, a 1B multimodal embedding model that assesses synthetic speech quality at the word level, thereby improving the quality of synthetically augmented datasets for training ASR models. Trained on CommonVoice 16.1 Dutch with 5 corruption strategies, this model catches mispronunciations, timing errors, and prosody issues in synthetic data that sentence-level embeddings miss entirely.

Resources
- Dutch model: https://huggingface.co/yuriyvnv/WAVe-1B-Multimodal-NL
- Portuguese model: https://huggingface.co/yuriyvnv/WAVe-1B-Multimodal-PT
- Code: https://github.com/yuriyvnv/WAVe

This model builds on CommonVoice Dutch data. Thanks to @mozilla and the CommonVoice community for making multilingual speech data accessible.

Would be great to hear from the Dutch NLP community (@BramVanroy @GroNLP), especially if you're working on Dutch ASR or TTS pipelines where quality filtering could help. Also tagging @hf-audio as this sits at the intersection of speech processing and data curation.
liked a model · 21 days ago
hexgrad/Kokoro-82M
liked a Space · about 2 months ago
antalvdb/olifant-explainability-demo
Organizations
BramVanroy's activity
New activity in mistralai/Mistral-Small-3.1-24B-Instruct-2503 · 3 months ago
regex pattern · 7 · #84 opened 4 months ago by pandora-s
New activity in universalner/universal_ner · 4 months ago
Data leakage and duplication · #3 opened 4 months ago by BramVanroy
Cannot be loaded with datasets · #2 opened 4 months ago by BramVanroy
New activity in eriktks/conll2003 · 4 months ago
Have problem in loading data through load_dataset(), saying "Dataset scripts are no longer supported, but found conll2003.py" · 👍 1 · 5 · #13 opened 8 months ago by Jiebro02
New activity in instituutnederlandsetaal/DuidelijkeTaal-v1.0 · 5 months ago
[bot] Conversion to Parquet · #1 opened 6 months ago by parquet-converter
New activity in nlptown/sentiment · 6 months ago
Space not working · #1 opened 6 months ago by BramVanroy
New activity in HuggingFaceH4/Multilingual-Thinking · 7 months ago
Translation code? · #3 opened 7 months ago by BramVanroy
New activity in institutional/institutional-books-1.0 · 9 months ago
License discussion · 👍 1 · 9 · #2 opened 9 months ago by PatentPilotAI
New activity in Lightricks/ltx-video-distilled · 10 months ago
Link sharing flawed · 👍 3 · 2 · #6 opened 10 months ago by BramVanroy
New activity in BramVanroy/fineweb-duckdbs · 10 months ago
Abusing tf out of xet privs · 4 · #1 opened 10 months ago by ZennyKenny
New activity in BramVanroy/CommonCrawl-CreativeCommons · 11 months ago
[bot] Conversion to Parquet · #1 opened about 1 year ago by parquet-converter
New activity in utter-project/EuroLLM-9B-Instruct · 11 months ago
Training data · 9 · #2 opened over 1 year ago by BramVanroy
New activity in BramVanroy/mateo-demo · about 1 year ago
Apply for community grant: Academic project · 1 · #1 opened almost 3 years ago by BramVanroy
New activity in reach-vb/GPT-4.5-System-Card · about 1 year ago
Upload gpt-4-5-system-card.pdf · 1 · #1 opened about 1 year ago by reach-vb
New activity in BramVanroy/WildChat-1M-filtered-gpt-4 · about 1 year ago
Add language tag · ❤️ 1 · 1 · #2 opened about 1 year ago by lbourdois
commented a paper · about 1 year ago
Fietje: An open, efficient LLM for Dutch
Paper • 2412.15450 • Published Dec 19, 2024 • 4 • 3
New activity in GroNLP/dutch-cola · about 1 year ago
Citation · 2 · #2 opened about 1 year ago by BramVanroy
New activity in ml6team/README · over 1 year ago
Open source CC-BY dataset and classifier? · 👀 1 · 9 · #1 opened over 1 year ago by burtenshaw
New activity in HPLT/hplt_bert_base_fr · over 1 year ago
Adding `safetensors` variant of this model · 3 · #1 opened over 1 year ago by SFconvertbot
New activity in instituutnederlandsetaal/galahad-corpus-data-v1.0.1 · over 1 year ago
Librarian Bot: Add language metadata for dataset · #2 opened over 1 year ago by librarian-bot