Instructions for using utter-project/mHuBERT-147 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use utter-project/mHuBERT-147 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="utter-project/mHuBERT-147")
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModel

processor = AutoProcessor.from_pretrained("utter-project/mHuBERT-147")
model = AutoModel.from_pretrained("utter-project/mHuBERT-147")
```
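The snippets above only load the model. As a minimal sketch of running feature extraction, assuming 16 kHz mono input and the standard HuBERT encoder stride of 320 samples (~20 ms per output frame); the `transformers` calls are shown commented out because they download the checkpoint:

```python
import numpy as np

SAMPLE_RATE = 16000   # mHuBERT-147 expects 16 kHz mono audio
FRAME_STRIDE = 320    # HuBERT-style encoders emit one frame per 320 samples (~20 ms)

# One second of silence as a stand-in for real audio.
waveform = np.zeros(SAMPLE_RATE, dtype=np.float32)

# Expected number of output frames for a waveform of this length.
expected_frames = len(waveform) // FRAME_STRIDE
print(expected_frames)  # 50

# With the model loaded as above, feature extraction looks roughly like:
# import torch
# inputs = processor(waveform, sampling_rate=SAMPLE_RATE, return_tensors="pt")
# with torch.no_grad():
#     hidden = model(**inputs).last_hidden_state  # shape: (1, frames, hidden_size)
```

The frame count is useful for sanity-checking alignment between audio length and the feature sequence the encoder returns.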
Comparison to w2v-bert-2.0 (#13), opened by anferico
Hi, how does this model compare to w2v-BERT 2.0? mHuBERT is pre-trained on a total of 90K hours of speech across 147 languages, whereas w2v-BERT 2.0 was pre-trained on 4.5M hours of unlabeled speech covering more than 143 languages. Disregarding parameter count (and therefore inference time), w2v-BERT 2.0 has a striking advantage on paper, so I would be curious to know if any performance comparisons were carried out.
Hi! Thanks for your interest in our model! Unfortunately, I did not have time to investigate this. Their model was released more or less at the same time we were finishing our experiments. If you ever benchmark them, don't hesitate to share the results. :)