aoiandroid committed on
Commit 5358003 · verified · 1 Parent(s): a591ef5

Add model card (README.md)

  1. README.md +57 -9
README.md CHANGED
@@ -1,21 +1,69 @@
  ---
  license: cc-by-nc-4.0
- base_model: facebook/mms-lid-126
  tags:
- - coreml
  - language-identification
  - audio
  - wav2vec2
  ---

- # MMS-LID 126 CoreML

- CoreML export of [facebook/mms-lid-126](https://huggingface.co/facebook/mms-lid-126) for iOS/macOS.

- - **mms_lid.mlmodel**: CoreML Neural Network (fixed 10s @ 16 kHz mono input, logits output).
- - **mms_lid_id2label.json**: Language ID to label mapping.

- Input: raw waveform float32, shape `(1, 160000)` (16 kHz, 10 seconds).
- Output: `logits` shape `(1, 126)`; use `argmax` then `id2label[id]` for language code.

- See the conversion repo docs for full input/output spec.
 
  ---
+ language:
+ - multilingual
  license: cc-by-nc-4.0
  tags:
  - language-identification
+ - coreml
+ - ios
  - audio
  - wav2vec2
+ - mms-lid
+ datasets:
+ - mms-lid
  ---

+ # MMS-LID 126 (Core ML)
+
+ Core ML conversion of **MMS-LID** (Massively Multilingual Speech - Language Identification) for **126 languages**. Float16 model for on-device inference on iOS 17+ and macOS.
+
+ - **Base model:** [facebook/mms-lid-126](https://huggingface.co/facebook/mms-lid-126)
+ - **Format:** Core ML (.mlpackage), float16
+ - **Languages:** 126 (ISO 639-3)
+
+ ## Contents
+
+ - `mms_lid.mlpackage` (or equivalent) – Core ML model
+ - `labels.json` or `mms_lid_id2label.json` – Index to language code mapping
+
+ ## Input / Output
+
+ - **Input:** 16 kHz mono float32 audio, 10 seconds (160,000 samples), e.g. `input_values`
+ - **Output:** Logits over 126 language classes; `argmax` gives the predicted language index. Map to ISO 639-3 using the labels file.
+
+ ## Usage (iOS / macOS)
+
+ 1. Download this repo (e.g. via Hugging Face Hub or in-app download).
+ 2. Load the `.mlpackage` with Core ML; feed 10 seconds of 16 kHz mono audio.
+ 3. Take `argmax` of the logits output and look up the language code in `labels.json` or `mms_lid_id2label.json`.
+
+ ## Quantized variants (same language count)
+
+ | Repo | Description |
+ |------|-------------|
+ | **this repo** | Float16 Core ML |
+ | [mms-lid-126-coreml-4bit](https://huggingface.co/aoiandroid/mms-lid-126-coreml-4bit) | 4-bit palettized (smaller, ANE-friendly) |
+ | [mms-lid-126-coreml-int8](https://huggingface.co/aoiandroid/mms-lid-126-coreml-int8) | INT8 quantized |
+
+ ## Related repos (other language counts)
+
+ | Languages | Core ML |
+ |-----------|---------|
+ | 126 | **this repo** |
+ | 256 | [mms-lid-256-coreml](https://huggingface.co/aoiandroid/mms-lid-256-coreml) |
+ | 512 | [mms-lid-512-coreml](https://huggingface.co/aoiandroid/mms-lid-512-coreml) |

+ ## Citation

+ ```bibtex
+ @article{pratap2023mms,
+ title={Scaling Speech Technology to 1,000+ Languages},
+ author={Vineel Pratap and Andros Tjandra and Bowen Shi and Paden Tomasello and Arun Babu and Sayani Kundu and Ali Elkahky and Zhaoheng Ni and Apoorv Vyas and Maryam Fazel-Zarandi and Alexei Baevski and Yossi Adi and Xiaohui Zhang and Wei-Ning Hsu and Alexis Conneau and Michael Auli},
+ journal={arXiv},
+ year={2023}
+ }
+ ```

+ ## License

+ CC-BY-NC-4.0 (inherited from MMS-LID).
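
The input/output contract described in the new README (10 seconds of 16 kHz mono audio in, logits out, then `argmax` and a label lookup) can be sketched in plain Python. This is a minimal illustration and not code from the repo: the helper names `fit_to_window` and `decode_language` are hypothetical, zero-padding short clips is an assumed policy, the label map is a three-entry stand-in for the real labels file, and the Core ML prediction call itself is left out.

```python
import json

SAMPLE_RATE = 16_000                   # Hz, as stated in the model card
WINDOW_SAMPLES = 10 * SAMPLE_RATE      # 160,000 samples = 10 seconds


def fit_to_window(samples, length=WINDOW_SAMPLES):
    """Trim or zero-pad a mono waveform to exactly `length` samples.

    Zero-padding short clips is an assumption; the card only says the
    model expects a fixed 10-second input.
    """
    clipped = list(samples[:length])
    return clipped + [0.0] * (length - len(clipped))


def decode_language(logits, id2label):
    """Return the label of the highest logit (argmax, then lookup).

    Assumes `id2label` is a dict keyed by stringified indices, which is
    how JSON stores integer keys; adapt this if the labels file is a list.
    """
    best = max(range(len(logits)), key=lambda i: logits[i])
    return id2label[str(best)]


if __name__ == "__main__":
    # Hypothetical label map standing in for the real 126-entry file.
    id2label = json.loads('{"0": "eng", "1": "fra", "2": "jpn"}')
    window = fit_to_window([0.1] * 8_000)   # half a second of audio
    print(len(window))                      # 160000
    print(decode_language([-1.2, 3.4, 0.5], id2label))  # fra
```

In an app, the padded window would be handed to the Core ML model and the resulting 126 logits passed to the lookup step; clips longer than 10 seconds are simply truncated here, though scoring several windows and averaging is another option.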