MMS-LID 512 (Core ML, 6-bit)

Core ML conversion of MMS-LID, a spoken language identification model covering 512 languages. Weights are palettized to 6 bits to balance model size against accuracy on iPad and Mac.

  • Base model: facebook/mms-lid-512
  • Format: Core ML (.mlpackage), 6-bit palettized
  • Languages: 512 (ISO 639-3)

Contents

  • 6-bit palettized Core ML model (.mlpackage)
  • Label mapping file(s) from class index to language code

Input / Output

  • Input: 16 kHz mono float32 audio, 10 seconds (160,000 samples)
  • Output: Logits over 512 language classes. Take the argmax and look up that index in the labels file to get the ISO 639-3 code.
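
The input and output handling above can be sketched in plain Python. This is a minimal illustration, not code shipped with this repo: the helper names `fit_to_window` and `top_language` are hypothetical, zero-padding clips shorter than 10 seconds is an assumption (the card does not say how short audio should be handled), and the labels are assumed to be loaded into a list ordered by class index.

```python
import math

SAMPLE_RATE = 16_000    # Hz, from the input spec above
NUM_SAMPLES = 160_000   # 10 seconds at 16 kHz


def fit_to_window(samples, num_samples=NUM_SAMPLES):
    """Truncate or zero-pad a mono waveform to exactly num_samples values.

    Zero-padding is an assumption; the model card does not specify it.
    """
    clipped = list(samples[:num_samples])
    return clipped + [0.0] * (num_samples - len(clipped))


def top_language(logits, labels):
    """Return (language_code, softmax_probability) for the argmax class.

    `labels` is assumed to be a list where labels[i] is the ISO 639-3
    code for class index i.
    """
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    best = max(range(len(logits)), key=lambda i: logits[i])
    peak = logits[best]
    # Numerically stable softmax: the winning class contributes exp(0) = 1.
    denom = sum(math.exp(x - peak) for x in logits)
    return labels[best], 1.0 / denom
```

In use, the output of `fit_to_window` would be passed to the Core ML model as float32 audio, and the 512 logits it returns would go to `top_language` together with the labels loaded from the mapping file.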

Related

  Variant    Repo
  Float16    mms-lid-512-coreml
  4-bit      mms-lid-512-coreml-4bit
  6-bit      this repo
  INT8       mms-lid-512-coreml-int8
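
As a rough guide to how the variants compare in size, the bits stored per weight can be compared against Float16. This is back-of-the-envelope arithmetic under stated assumptions, not measured file sizes: it ignores palette lookup tables, any layers left unquantized, and package metadata, so real .mlpackage sizes will differ somewhat.

```python
# Nominal bits per stored weight for each variant (assumption: every
# weight is stored at the variant's nominal precision).
BITS_PER_WEIGHT = {"Float16": 16, "INT8": 8, "6-bit": 6, "4-bit": 4}


def relative_size(variant, baseline="Float16"):
    """Approximate weight-storage size of `variant` relative to `baseline`."""
    return BITS_PER_WEIGHT[variant] / BITS_PER_WEIGHT[baseline]


for name in BITS_PER_WEIGHT:
    print(f"{name}: {relative_size(name):.3f}x of Float16 weight storage")
```

By this estimate the 6-bit variant needs about 0.375 times the weight storage of Float16, which places it between the INT8 and 4-bit variants.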

Citation

@article{pratap2023mms,
  title={Scaling Speech Technology to 1,000+ Languages},
  author={Vineel Pratap and Andros Tjandra and Bowen Shi and Paden Tomasello and Arun Babu and Sayani Kundu and Ali Elkahky and Zhaoheng Ni and Apoorv Vyas and Maryam Fazel-Zarandi and Alexei Baevski and Yossi Adi and Xiaohui Zhang and Wei-Ning Hsu and Alexis Conneau and Michael Auli},
  journal={arXiv},
  year={2023}
}

License

CC-BY-NC-4.0 (inherited from MMS-LID).
