Commit
·
7116a2a
1
Parent(s):
b2c866d
Update README.md
README.md
CHANGED
|
@@ -8,6 +8,17 @@ tags:
|
|
| 8 |
|
| 9 |
This repository contains n-gram language models trained on Common Crawl data ([Conneau et al. 2020b](https://aclanthology.org/2020.acl-main.747/), [NLLB_Team et al. 2022](https://arxiv.org/abs/2207.04672)) using the [KenLM library](https://github.com/kpu/kenlm).
|
| 10 |
|
| 11 |
## Table of Contents
|
| 12 |
|
| 13 |
- [Example](#example)
|
|
@@ -17,10 +28,8 @@ This repository consists of the n-gram language models trained on Common Crawl d
|
|
| 17 |
|
| 18 |
## Example
|
| 19 |
|
| 20 |
-
|
| 21 |
|
| 22 |
-
TODO
|
| 23 |
-
```
|
| 24 |
|
| 25 |
## Supported Languages
|
| 26 |
|
|
|
|
| 8 |
|
| 9 |
This repository contains n-gram language models trained on Common Crawl data ([Conneau et al. 2020b](https://aclanthology.org/2020.acl-main.747/), [NLLB_Team et al. 2022](https://arxiv.org/abs/2207.04672)) using the [KenLM library](https://github.com/kpu/kenlm).
|
| 10 |
|
| 11 |
+
|
| 12 |
+
The LMs for the following languages are not stored in this repository (due to the 50GB limit on HuggingFace) and can be downloaded using the links below.
|
| 13 |
+
|
| 14 |
+
Mandarin Chinese (Simplified) - [Download LM](https://dl.fbaipublicfiles.com/mms/lms/cmn-script_simplified/char_20gram.bin)
|
| 15 |
+
|
| 16 |
+
Japanese - [Download LM](https://dl.fbaipublicfiles.com/mms/lms/jpn/char_20gram.bin)
|
| 17 |
+
|
| 18 |
+
Thai - [Download LM](https://dl.fbaipublicfiles.com/mms/lms/tha/char_20gram.bin)
|
| 19 |
+
|
| 20 |
+
Cantonese (Traditional) - [Download LM](https://dl.fbaipublicfiles.com/mms/lms/yue-script_traditional/char_20gram.bin)
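
The four externally hosted LMs could be fetched with a short script such as the one below. This is only a minimal sketch: it assumes the files are served over ordinary `https://` URLs at the paths shown in the links, and the names `LM_DIRS`, `lm_url`, `download_lm`, and the local `lms/` directory are hypothetical helpers introduced for illustration, not part of this repository.

```python
import urllib.request
from pathlib import Path

# Assumed base URL, taken from the download links for the externally hosted LMs.
BASE_URL = "https://dl.fbaipublicfiles.com/mms/lms"

# Directory names copied from the download links; the dict itself is illustrative.
LM_DIRS = {
    "Mandarin Chinese (Simplified)": "cmn-script_simplified",
    "Japanese": "jpn",
    "Thai": "tha",
    "Cantonese (Traditional)": "yue-script_traditional",
}


def lm_url(lm_dir: str) -> str:
    """Build the download URL of the character-level 20-gram LM for one directory."""
    return f"{BASE_URL}/{lm_dir}/char_20gram.bin"


def download_lm(lm_dir: str, dest_root: str = "lms") -> Path:
    """Download one LM into dest_root/<lm_dir>/ unless the file already exists."""
    dest = Path(dest_root) / lm_dir / "char_20gram.bin"
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        # The files are large, so an existing local copy is reused.
        urllib.request.urlretrieve(lm_url(lm_dir), dest)
    return dest
```

For example, `download_lm(LM_DIRS["Japanese"])` would store the Japanese LM under `lms/jpn/char_20gram.bin`.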
|
| 21 |
+
|
| 22 |
## Table of Contents
|
| 23 |
|
| 24 |
- [Example](#example)
|
|
|
|
| 28 |
|
| 29 |
## Example
|
| 30 |
|
| 31 |
+
Check out the code at https://huggingface.co/spaces/mms-meta/MMS/blob/main/asr.py, which uses these LMs to decode the output of ASR models.
|
| 32 |
|
| 33 |
|
| 34 |
## Supported Languages
|
| 35 |
|