# MMS-LID 256 (ONNX)
ONNX export of MMS-LID (Massively Multilingual Speech - Language Identification) for 256 languages. For on-device or server inference without PyTorch.
- Base model: facebook/mms-lid-256
- Format: ONNX
- Languages: 256 (ISO 639-3)
## Contents
- ONNX model file(s) for the Wav2Vec2-based LID classifier
- Label mapping (e.g. `labels.json` or `mms_lid_id2label.json`) for index-to-language-code lookup
## Input / Output
- Input: Raw waveform, 16 kHz mono, 10 seconds (160,000 samples)
- Output: Logits over 256 language classes; `argmax` gives the predicted language index. Map the index to an ISO 639-3 code using the included labels file.
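Since the model expects a fixed 10-second window, shorter or longer clips need to be padded or trimmed first. A minimal sketch of that preprocessing step (the helper name `to_fixed_window` is illustrative, not part of this repo):

```python
import numpy as np

TARGET_LEN = 160_000  # 10 s at 16 kHz

def to_fixed_window(waveform: np.ndarray) -> np.ndarray:
    """Trim or zero-pad a 16 kHz mono clip to exactly 160,000 samples."""
    waveform = waveform.astype(np.float32)
    if len(waveform) >= TARGET_LEN:
        return waveform[:TARGET_LEN]
    return np.pad(waveform, (0, TARGET_LEN - len(waveform)))

audio = np.random.randn(120_000).astype(np.float32)  # e.g. a 7.5 s clip
batch = to_fixed_window(audio)[None, :]              # shape (1, 160000)
```

Depending on how the model was exported, the upstream Wav2Vec2 feature extractor may also apply zero-mean/unit-variance normalization; check the export settings before skipping that step.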
## Usage
- Load the ONNX model with your runtime (e.g. ONNX Runtime, or convert further to Core ML for iOS).
- Feed 10 seconds of 16 kHz mono float32 audio.
- Take the `argmax` of the logits output and look up the language code in the labels file.
## Related repos
| Languages | Format | Repo |
|---|---|---|
| 256 | ONNX | this repo |
| 126 | Core ML | mms-lid-126-coreml |
| 256 | Core ML | mms-lid-256-coreml |
| 512 | Core ML | mms-lid-512-coreml |
## Citation
```bibtex
@article{pratap2023mms,
  title={Scaling Speech Technology to 1,000+ Languages},
  author={Vineel Pratap and Andros Tjandra and Bowen Shi and Paden Tomasello and Arun Babu and Sayani Kundu and Ali Elkahky and Zhaoheng Ni and Apoorv Vyas and Maryam Fazel-Zarandi and Alexei Baevski and Yossi Adi and Xiaohui Zhang and Wei-Ning Hsu and Alexis Conneau and Michael Auli},
  journal={arXiv},
  year={2023}
}
```
## License
CC-BY-NC-4.0 (inherited from MMS-LID).