Instructions for using utter-project/mHuBERT-147 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use utter-project/mHuBERT-147 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="utter-project/mHuBERT-147")
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModel

processor = AutoProcessor.from_pretrained("utter-project/mHuBERT-147")
model = AutoModel.from_pretrained("utter-project/mHuBERT-147")
```
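The snippets above only load the model. As a minimal sketch of running feature extraction, assuming 16 kHz mono input and the standard HuBERT encoder stride of 320 samples (~20 ms per output frame); the `transformers` calls are shown commented out because they download the checkpoint:

```python
import numpy as np

SAMPLE_RATE = 16000   # mHuBERT-147 expects 16 kHz mono audio
FRAME_STRIDE = 320    # HuBERT-style encoders emit one frame per 320 samples (~20 ms)

# One second of silence as a stand-in for real audio.
waveform = np.zeros(SAMPLE_RATE, dtype=np.float32)

# Expected number of output frames for a waveform of this length.
expected_frames = len(waveform) // FRAME_STRIDE
print(expected_frames)  # 50

# With the model loaded as above, feature extraction looks roughly like:
# import torch
# inputs = processor(waveform, sampling_rate=SAMPLE_RATE, return_tensors="pt")
# with torch.no_grad():
#     hidden = model(**inputs).last_hidden_state  # shape: (1, frames, hidden_size)
```

The frame count is useful for sanity-checking alignment between audio length and the feature sequence the encoder returns.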
Comparison to w2v-bert-2.0 (#13), opened by anferico
Hi, how does this model compare to w2v-BERT 2.0? mHuBERT is pre-trained on a total of 90K hours of speech across 147 languages, whereas w2v-BERT 2.0 was pre-trained on 4.5M hours of unlabeled speech covering more than 143 languages. Disregarding parameter count (and therefore inference time), w2v-BERT 2.0 has a striking advantage on paper, so I would be curious to know if any performance comparisons were carried out.
Hi! Thanks for your interest in our model! Unfortunately, I did not have time to investigate this. Their model was released more or less at the same time we were finishing our experiments. If you ever benchmark them, don't hesitate to share the results. :)