MMS-LID 512 (Core ML)
Core ML conversion of MMS-LID (Massively Multilingual Speech - Language Identification) for 512 languages. Float16 model for on-device inference on iOS 17+ and macOS.
- Base model: facebook/mms-lid-512
- Format: Core ML (.mlpackage), float16
- Languages: 512 (ISO 639-3)
Contents
- Core ML model (.mlpackage)
- `labels.json` or `mms_lid_id2label.json`: index-to-language-code mapping
Input / Output
- Input: 16 kHz mono float32 audio, 10 seconds (160,000 samples)
- Output: Logits over 512 language classes; `argmax` gives the predicted language index. Map it to an ISO 639-3 code using the labels file.
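Because the input is a fixed 160,000 samples, shorter clips need padding and longer ones need trimming before inference. Below is a minimal sketch of that step in plain Python. It assumes that zero-padding at the end and keeping only the first 10 seconds are acceptable, since this card does not specify a strategy. `fit_to_window` is a hypothetical helper name, not part of this repo.

```python
SAMPLE_RATE = 16_000          # Hz, from the input spec above
WINDOW_SECONDS = 10
WINDOW_SAMPLES = SAMPLE_RATE * WINDOW_SECONDS  # 160,000 samples


def fit_to_window(samples, target_len=WINDOW_SAMPLES):
    """Return exactly target_len samples: trim the tail or zero-pad the end.

    Assumption: zero-padding and keeping the first 10 s are acceptable;
    the model card does not say how to handle other durations.
    """
    samples = list(samples)
    if len(samples) >= target_len:
        return samples[:target_len]
    return samples + [0.0] * (target_len - len(samples))
```

The same logic carries over to Swift when filling the model's input array on device.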
Usage (iOS / macOS)
- Download this repo (e.g. via Hugging Face Hub or in-app download).
- Load the `.mlpackage` with Core ML; feed 10 seconds of 16 kHz mono audio.
- Take `argmax` of the logits output and look up the language code in the labels file.
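The final lookup step can be sketched in plain Python. This assumes the labels file is a JSON object keyed by stringified class index (the common `id2label` layout) or a JSON array ordered by index; the actual file in this repo may differ. `load_labels` and `predict_language` are hypothetical helper names, and the logits are taken to be a flat sequence of floats already read out of the Core ML output.

```python
import json


def load_labels(json_text):
    """Parse an index-to-code mapping.

    Assumption: the file is either a JSON object like {"0": "ara", ...}
    or a JSON array whose position is the class index.
    """
    data = json.loads(json_text)
    if isinstance(data, list):
        return {i: code for i, code in enumerate(data)}
    return {int(k): v for k, v in data.items()}


def predict_language(logits, labels):
    """Return (index, code) for the highest-scoring class."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return best, labels[best]
```

With a toy three-class mapping (not the real label order), `predict_language([0.1, 2.3, -1.0], load_labels('{"0": "ara", "1": "eng", "2": "fra"}'))` returns `(1, 'eng')`.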
Quantized variants (same language count)
| Repo | Description |
|---|---|
| this repo | Float16 Core ML |
| mms-lid-512-coreml-4bit | 4-bit palettized (smaller, ANE-friendly) |
| mms-lid-512-coreml-6bit | 6-bit palettized |
| mms-lid-512-coreml-int8 | INT8 quantized |
Related repos
| Languages | Core ML |
|---|---|
| 126 | mms-lid-126-coreml |
| 256 | mms-lid-256-coreml |
| 512 | this repo |
Citation
@article{pratap2023mms,
title={Scaling Speech Technology to 1,000+ Languages},
author={Vineel Pratap and Andros Tjandra and Bowen Shi and Paden Tomasello and Arun Babu and Sayani Kundu and Ali Elkahky and Zhaoheng Ni and Apoorv Vyas and Maryam Fazel-Zarandi and Alexei Baevski and Yossi Adi and Xiaohui Zhang and Wei-Ning Hsu and Alexis Conneau and Michael Auli},
journal={arXiv},
year={2023}
}
License
CC-BY-NC-4.0 (inherited from MMS-LID).