	---
	language:
	- multilingual
	license: cc-by-nc-4.0
	tags:
	- language-identification
	- coreml
	- ios
	- audio
	- wav2vec2
	- mms-lid
	datasets:
	- mms-lid
	---

	# MMS-LID 256 (Core ML)

	Core ML conversion of MMS-LID (Massively Multilingual Speech - Language Identification) for 256 languages. Float16 model for on-device inference on iOS 17+ and macOS.

	- Base model: [facebook/mms-lid-256](https://huggingface.co/facebook/mms-lid-256)
	- Format: Core ML (.mlpackage), float16
	- Languages: 256 (ISO 639-3)

	## Contents

	- Core ML model (.mlpackage)
	- Labels file (`labels.json` or `mms_lid_id2label.json`): maps each class index to its ISO 639-3 language code

	## Input / Output

	- Input: 16 kHz mono float32 audio, 10 seconds (160,000 samples)
	- Output: Logits over 256 language classes; `argmax` gives the predicted language index. Map to ISO 639-3 using the labels file.
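
The fixed input length follows from 16,000 samples per second × 10 seconds = 160,000 samples. A minimal sketch of fitting a clip to that length is shown below. It assumes the audio is already 16 kHz mono, and it assumes that shorter clips are zero-padded at the end; this card does not specify a padding strategy, and `fit_to_clip` is a hypothetical helper name.

```python
from array import array

SAMPLE_RATE_HZ = 16_000
CLIP_SECONDS = 10
NUM_SAMPLES = SAMPLE_RATE_HZ * CLIP_SECONDS  # 160,000 samples


def fit_to_clip(samples):
    """Return exactly NUM_SAMPLES float32 samples.

    Longer input is truncated; shorter input is zero-padded at the end
    (an assumption, not something this model card prescribes).
    """
    clip = array("f", samples[:NUM_SAMPLES])        # "f" = 32-bit float
    clip.extend([0.0] * (NUM_SAMPLES - len(clip)))  # no-op when already full
    return clip
```

Resampling and stereo-to-mono downmixing are out of scope for this sketch and would need to happen before this step.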

	## Usage (iOS / macOS)

	1. Download this repo (e.g. via Hugging Face Hub or in-app download).
	2. Load the `.mlpackage` with Core ML; feed 10 seconds of 16 kHz mono audio.
	3. Take `argmax` of the logits output and look up the language code in the labels file.
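
Step 3 can be sketched in a few lines, shown here in Python for brevity. The sketch assumes the labels file is a JSON object whose keys are stringified class indices (the usual `id2label` layout); check the actual file in this repo before relying on that. `predict_language` is a hypothetical helper, and the three-entry mapping is toy data, not the real 256-entry table.

```python
import json


def predict_language(logits, id2label):
    """Return (index, language code) for the highest-scoring class."""
    # argmax over the logits vector
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    # Assumed layout: JSON keys are strings such as "0", "1", ...
    return best_index, id2label[str(best_index)]


# Toy mapping for illustration only; the real file has 256 entries.
id2label = json.loads('{"0": "ara", "1": "eng", "2": "fra"}')
index, code = predict_language([-1.2, 3.4, 0.7], id2label)
print(index, code)  # prints "1 eng"
```

The same logic carries over directly to Swift once the logits are read out of the Core ML output.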

	## Quantized variants (same language count)

	| Repo | Description |
	|------|-------------|
	| this repo | Float16 Core ML |
	| [mms-lid-256-coreml-4bit](https://huggingface.co/aoiandroid/mms-lid-256-coreml-4bit) | 4-bit palettized (smaller, ANE-friendly) |

	## Related repos

	| Languages | ONNX | Core ML |
	|-----------|------|---------|
	| 256 | [mms-lid-256-onnx](https://huggingface.co/aoiandroid/mms-lid-256-onnx) | this repo |
	| 126 | – | [mms-lid-126-coreml](https://huggingface.co/aoiandroid/mms-lid-126-coreml) |
	| 512 | – | [mms-lid-512-coreml](https://huggingface.co/aoiandroid/mms-lid-512-coreml) |

	## Citation

	```bibtex
	@article{pratap2023mms,
	title={Scaling Speech Technology to 1,000+ Languages},
	author={Vineel Pratap and Andros Tjandra and Bowen Shi and Paden Tomasello and Arun Babu and Sayani Kundu and Ali Elkahky and Zhaoheng Ni and Apoorv Vyas and Maryam Fazel-Zarandi and Alexei Baevski and Yossi Adi and Xiaohui Zhang and Wei-Ning Hsu and Alexis Conneau and Michael Auli},
	journal={arXiv},
	year={2023}
	}
	```

	## License

	CC-BY-NC-4.0 (inherited from MMS-LID).