MMS-LID 256 (Core ML)

Core ML conversion of MMS-LID (Massively Multilingual Speech - Language Identification) for 256 languages. Float16 model for on-device inference on iOS 17+ and macOS.

Contents

  • Core ML model (.mlpackage)
  • labels.json or mms_lid_id2label.json – index-to-language-code mapping

Input / Output

  • Input: 16 kHz mono float32 audio, 10 seconds (160,000 samples)
  • Output: Logits over 256 language classes; argmax gives the predicted language index. Map to ISO 639-3 using the labels file.
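The argmax-and-lookup step can be sketched in Python (the string-keyed shape of the labels file and the "eng" entry below are assumptions for illustration; check the bundled JSON for the real mapping):

```python
import numpy as np

def predict_language(logits: np.ndarray, id2label: dict) -> str:
    """Map a (256,) logits vector to an ISO 639-3 code via argmax."""
    idx = int(np.argmax(logits))
    return id2label[str(idx)]

# Toy example with a placeholder 256-entry mapping:
id2label = {str(i): f"lang{i}" for i in range(256)}  # stand-in labels
id2label["42"] = "eng"  # hypothetical entry for illustration
logits = np.zeros(256, dtype=np.float32)
logits[42] = 5.0
print(predict_language(logits, id2label))  # -> eng
```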

Usage (iOS / macOS)

  1. Download this repo (e.g. via Hugging Face Hub or in-app download).
  2. Load the .mlpackage with Core ML; feed 10 seconds of 16 kHz mono audio.
  3. Take argmax of the logits output and look up the language code in the labels file.
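On macOS the same pipeline can also be driven from Python via coremltools. This is a sketch, not the repo's official API: the feature names ("audio", "logits") and file paths are assumptions; inspect your .mlpackage to find the real input/output names:

```python
import json
import numpy as np

SAMPLE_RATE = 16_000
CLIP_SAMPLES = 10 * SAMPLE_RATE  # 160,000 samples = 10 s at 16 kHz

def prepare_audio(wave: np.ndarray) -> np.ndarray:
    """Pad or trim 16 kHz mono audio to exactly 160,000 float32 samples."""
    wave = np.asarray(wave, dtype=np.float32)[:CLIP_SAMPLES]
    if wave.shape[0] < CLIP_SAMPLES:
        wave = np.pad(wave, (0, CLIP_SAMPLES - wave.shape[0]))
    return wave[np.newaxis, :]  # batch dimension; exact shape may differ

def classify(mlpackage_path: str, labels_path: str, wave: np.ndarray) -> str:
    """Run the Core ML model (macOS only). Feature names are assumptions."""
    import coremltools as ct  # pip install coremltools
    model = ct.models.MLModel(mlpackage_path)
    with open(labels_path) as f:
        id2label = json.load(f)
    out = model.predict({"audio": prepare_audio(wave)})  # "audio" assumed
    return id2label[str(int(np.argmax(out["logits"])))]  # "logits" assumed
```

`prepare_audio` handles clips shorter or longer than 10 seconds by zero-padding or truncating, matching the fixed 160,000-sample input the model expects.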

Quantized variants (same language count)

| Repo | Description |
|---|---|
| this repo | Float16 Core ML |
| mms-lid-256-coreml-4bit | 4-bit palettized (smaller, ANE-friendly) |

Related repos

| Languages | ONNX | Core ML |
|---|---|---|
| 256 | mms-lid-256-onnx | this repo |
| 126 | – | mms-lid-126-coreml |
| 512 | – | mms-lid-512-coreml |

Citation

```bibtex
@article{pratap2023mms,
  title={Scaling Speech Technology to 1,000+ Languages},
  author={Vineel Pratap and Andros Tjandra and Bowen Shi and Paden Tomasello and Arun Babu and Sayani Kundu and Ali Elkahky and Zhaoheng Ni and Apoorv Vyas and Maryam Fazel-Zarandi and Alexei Baevski and Yossi Adi and Xiaohui Zhang and Wei-Ning Hsu and Alexis Conneau and Michael Auli},
  journal={arXiv},
  year={2023}
}
```

License

CC-BY-NC-4.0 (inherited from MMS-LID).
