Commit 6b2f84f
1 Parent(s): bcbd17d
Update README.md
README.md CHANGED
@@ -14,7 +14,11 @@ license: cc-by-4.0

## WavLabLM-MS 40k

-
+[Paper](https://arxiv.org/abs/2309.15317)
+
+This model was trained by [William Chen](https://wanchichen.github.io/) using ESPnet2's SSL recipe in [espnet](https://github.com/espnet/espnet/).
+WavLabLM is a self-supervised audio encoder pre-trained on 40,000 hours of multilingual data across 136 languages. This specific variant, WavLabLM-MS, went through a second stage of pre-training on a balanced subset of the data to improve performance on lower-resource languages.
+It achieves performance comparable to XLS-R 128 on the [ML-SUPERB Benchmark](https://arxiv.org/abs/2305.10615) while using only 10% as much pre-training data.
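The 10% figure in the added README text can be sanity-checked with a line of arithmetic. The sketch below is only illustrative: the 40,000 hours comes from the README itself, while the 436,000 hours for XLS-R is an assumption taken from the XLS-R paper and does not appear in this commit. The names `WAVLABLM_HOURS`, `XLSR_HOURS`, and `data_fraction` are hypothetical helpers introduced here.

```python
# Hours of pre-training audio.
# WAVLABLM_HOURS is stated in the README text of this commit.
# XLSR_HOURS is an assumption (figure reported in the XLS-R paper), not part of the commit.
WAVLABLM_HOURS = 40_000
XLSR_HOURS = 436_000


def data_fraction(hours: float, reference_hours: float) -> float:
    """Return `hours` as a fraction of `reference_hours`."""
    if reference_hours <= 0:
        raise ValueError("reference_hours must be positive")
    return hours / reference_hours


if __name__ == "__main__":
    ratio = data_fraction(WAVLABLM_HOURS, XLSR_HOURS)
    # Prints the ratio as a percentage with one decimal place.
    print(f"WavLabLM uses {ratio:.1%} of the XLS-R pre-training hours")
```

Under that assumed XLS-R figure the ratio comes out a little below one tenth, which is consistent with the README's rounded claim.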