---
license: cc-by-nc-4.0
tags:
- coreml
- translation
- nllb
- multilingual
- on-device
- iOS
- macOS
library_name: coremltools
base_model: facebook/nllb-200-distilled-600M
---

# NLLB-200 CoreML (128 tokens)

On-device neural machine translation for **200 languages** using CoreML on Apple devices (iPhone, iPad, Mac).

This is a CoreML conversion of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) optimized for:

- ✅ Fast on-device inference
- ✅ GPU/Neural Engine acceleration
- ✅ 128-token context (≈80-100 words)

## 📦 What's Included

```
.
├── NLLB_Encoder_128.mlpackage   # Encoder model (~1.5 GB)
├── NLLB_Decoder_128.mlpackage   # Decoder model (~1.7 GB)
├── tokenizer/                   # Tokenizer files
├── example.py                   # Ready-to-run example
└── language_codes.json          # Language code reference
```

## 🚀 Quick Start

### Installation

```bash
pip install coremltools transformers
```

### Download Models

```bash
# Clone this repo
git lfs install
git clone https://huggingface.co/cstr/nllb-200-coreml-128
cd nllb-200-coreml-128
```

### Run Translation

```python
from example import translate_text

# English to German
result = translate_text(
    "Hello, how are you today?",
    source_lang="eng_Latn",
    target_lang="deu_Latn"
)
print(result)  # "Hallo, wie geht es dir heute?"
```

## 💡 Usage Examples

### Multiple Languages

```python
from example import translate_text

# English → Spanish
translate_text("Good morning!", "eng_Latn", "spa_Latn")
# → "¡Buenos días!"
# French → English
translate_text("Bonjour le monde", "fra_Latn", "eng_Latn")
# → "Hello world"

# Japanese → English
translate_text("こんにちは", "jpn_Jpan", "eng_Latn")
# → "Hello"
```

### Production Usage

```python
import coremltools as ct
from transformers import AutoTokenizer


class Translator:
    def __init__(self):
        # Load once, reuse for all translations
        self.encoder = ct.models.MLModel(
            "NLLB_Encoder_128.mlpackage",
            compute_units=ct.ComputeUnit.ALL  # Use GPU/Neural Engine
        )
        self.decoder = ct.models.MLModel(
            "NLLB_Decoder_128.mlpackage",
            compute_units=ct.ComputeUnit.ALL
        )
        self.tokenizer = AutoTokenizer.from_pretrained("./tokenizer")

    def translate(self, text, src_lang, tgt_lang):
        # Your translation logic here
        pass


# Create once
translator = Translator()

# Reuse many times (fast!)
translator.translate("Hello", "eng_Latn", "deu_Latn")
translator.translate("Goodbye", "eng_Latn", "fra_Latn")
```

## 🌍 Supported Languages

See `language_codes.json` for the full list of 200+ languages. Common examples:

| Language | Code |
|----------|------|
| English | `eng_Latn` |
| German | `deu_Latn` |
| French | `fra_Latn` |
| Spanish | `spa_Latn` |
| Chinese (Simplified) | `zho_Hans` |
| Japanese | `jpn_Jpan` |
| Arabic | `arb_Arab` |
| Russian | `rus_Cyrl` |

Full list: [NLLB Language Codes](https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200)

## ⚙️ Technical Details

- **Max Tokens**: 128 (≈80-100 words depending on language)
- **Precision**: FLOAT16
- **Compute**: CPU + GPU + Neural Engine
- **Base Model**: facebook/nllb-200-distilled-600M

## 🔧 Advanced Options

### CPU-Only Mode

```python
encoder = ct.models.MLModel(
    "NLLB_Encoder_128.mlpackage",
    compute_units=ct.ComputeUnit.CPU_ONLY
)
```

### Batch Processing

```python
texts = ["Hello", "Goodbye", "Thank you"]
translations = [translate_text(t, "eng_Latn", "deu_Latn") for t in texts]
```

## ⚠️ Limitations

- **128 token limit**: Longer text is truncated (≈80-100 words)
- **Quality**:
Distilled model, slightly lower quality than the full NLLB-3.3B
- **Low-resource languages**: May have reduced accuracy
- **No streaming**: Complete sentence processing only

## 📝 License

- **Models**: CC-BY-NC-4.0 (inherited from NLLB-200)
- **Code**: MIT

⚠️ **Non-commercial use only** per the NLLB license
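## ✂️ Appendix: Chunking Longer Text

Because the converted models accept at most 128 tokens, longer input must be split before translation rather than silently truncated. Below is a minimal sentence-aligned chunking sketch; `chunk_text` and its `max_words` budget are illustrative helpers (not part of `example.py`), and the word count is only a rough stand-in for the true token count, which should be checked with the NLLB tokenizer:

```python
import re


def chunk_text(text, max_words=80):
    """Split text into sentence-aligned chunks of at most max_words words.

    Approximation only: the real 128-token budget should be verified with
    the NLLB tokenizer; 80 words is a conservative stand-in. A single
    sentence longer than max_words still becomes its own (oversized) chunk
    and would need truncation or further splitting.
    """
    # Split on sentence-ending punctuation followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        words = len(sentence.split())
        if current and count + words > max_words:
            # Budget exceeded: flush the current chunk and start a new one
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += words
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk can then be passed to `translate_text()` individually and the results joined back together.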