aoiandroid committed
Commit 7734207 · verified · 1 Parent(s): 4bf3a53

Add model card (README.md)

Files changed (1)
  1. README.md +61 -0
README.md ADDED
@@ -0,0 +1,61 @@
+ ---
+ language:
+ - multilingual
+ license: cc-by-nc-4.0
+ tags:
+ - language-identification
+ - onnx
+ - audio
+ - wav2vec2
+ - mms-lid
+ datasets:
+ - mms-lid
+ ---
+
+ # MMS-LID 256 (ONNX)
+
+ ONNX export of **MMS-LID** (Massively Multilingual Speech - Language Identification) for **256 languages**. Intended for on-device or server inference without PyTorch.
+
+ - **Base model:** [facebook/mms-lid-256](https://huggingface.co/facebook/mms-lid-256)
+ - **Format:** ONNX
+ - **Languages:** 256 (ISO 639-3)
+
+ ## Contents
+
+ - ONNX model file(s) for the Wav2Vec2-based LID classifier
+ - Label mapping (e.g. `labels.json` or `mms_lid_id2label.json`) from class index to language code
+
+ ## Input / Output
+
+ - **Input:** Raw waveform, 16 kHz mono, 10 seconds (160,000 samples)
+ - **Output:** Logits over 256 language classes; `argmax` gives the predicted language index. Map the index to its ISO 639-3 code using the included labels file.
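
The fixed input length above means real recordings usually need to be trimmed or padded before inference. The following is a minimal sketch of one way to do that. The helper name `fit_to_clip` is hypothetical, and the choice to truncate from the end and right-pad with zeros is an assumption; the model card does not say how clips of other lengths should be handled.

```python
SAMPLE_RATE = 16_000                      # Hz, per the model card
CLIP_SECONDS = 10                         # seconds, per the model card
TARGET_LEN = SAMPLE_RATE * CLIP_SECONDS   # 160,000 samples


def fit_to_clip(samples, target_len=TARGET_LEN):
    """Return a list of exactly target_len float samples.

    Assumption (not stated in the model card): longer clips are truncated
    from the end and shorter clips are right-padded with zeros (silence).
    """
    clipped = [float(s) for s in samples[:target_len]]
    clipped.extend([0.0] * (target_len - len(clipped)))
    return clipped


short_clip = [0.25] * (3 * SAMPLE_RATE)   # 3 seconds of dummy audio
print(len(fit_to_clip(short_clip)))       # 160000
```

Resampling to 16 kHz and downmixing to mono are separate steps and are not covered by this sketch.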
+
+ ## Usage
+
+ 1. Load the ONNX model with your runtime (e.g. ONNX Runtime, or convert further to Core ML for iOS).
+ 2. Feed 10 seconds of 16 kHz mono float32 audio.
+ 3. Take `argmax` of the logits output and look up the language code in the labels file.
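
Step 3 can be sketched in plain Python as below. This is an illustration under stated assumptions, not the repo's own code: the labels file is assumed to be a JSON object mapping string indices to ISO 639-3 codes, the three-entry mapping is a stand-in rather than the real label order, and the function names are hypothetical. The softmax step is an optional addition for reading the top logit as a probability; the model card itself only mentions `argmax`.

```python
import json
import math


def softmax(logits):
    """Convert logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_language(logits, id2label):
    """Return (language_code, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)  # argmax
    # Assumption: keys in the labels JSON are string indices ("0", "1", ...).
    return id2label[str(best)], probs[best]


# Stand-in for the contents of the labels file; NOT the real index order.
id2label = json.loads('{"0": "ara", "1": "eng", "2": "fra"}')

code, confidence = predict_language([-1.2, 3.4, 0.5], id2label)
print(code)  # eng
```

With the real model, `logits` would be the 256-element output for one clip, and `id2label` would be loaded from the labels file shipped in this repo.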
+
+ ## Related repos
+
+ | Languages | Format | Repo |
+ |-----------|--------|------|
+ | 256 | ONNX | **this repo** |
+ | 126 | Core ML | [mms-lid-126-coreml](https://huggingface.co/aoiandroid/mms-lid-126-coreml) |
+ | 256 | Core ML | [mms-lid-256-coreml](https://huggingface.co/aoiandroid/mms-lid-256-coreml) |
+ | 512 | Core ML | [mms-lid-512-coreml](https://huggingface.co/aoiandroid/mms-lid-512-coreml) |
+
+ ## Citation
+
+ ```bibtex
+ @article{pratap2023mms,
+ title={Scaling Speech Technology to 1,000+ Languages},
+ author={Vineel Pratap and Andros Tjandra and Bowen Shi and Paden Tomasello and Arun Babu and Sayani Kundu and Ali Elkahky and Zhaoheng Ni and Apoorv Vyas and Maryam Fazel-Zarandi and Alexei Baevski and Yossi Adi and Xiaohui Zhang and Wei-Ning Hsu and Alexis Conneau and Michael Auli},
+ journal={arXiv},
+ year={2023}
+ }
+ ```
+
+ ## License
+
+ CC-BY-NC-4.0 (inherited from MMS-LID).