---
license: cc-by-nc-4.0
tags:
  - coreml
  - translation
  - nllb
  - multilingual
  - on-device
  - iOS
  - macOS
library_name: coremltools
base_model: facebook/nllb-200-distilled-600M
---

# NLLB-200 CoreML (256 tokens)

On-device neural machine translation for **200 languages** using CoreML on Apple devices (iPhone, iPad, Mac).

This is a CoreML conversion of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) optimized for:

- ✅ Fast on-device inference
- ✅ GPU/Neural Engine acceleration
- ✅ 256-token context (≈150-180 words)
- ✅ 2× longer context than the 128-token version

## 📦 What's Included

```
.
├── NLLB_Encoder_256.mlpackage   # Encoder model (~1.5 GB)
├── NLLB_Decoder_256.mlpackage   # Decoder model (~1.7 GB)
├── tokenizer/                   # Tokenizer files
├── example.py                   # Ready-to-run example
└── language_codes.json          # Language code reference
```

## 🚀 Quick Start

### Installation

```bash
pip install coremltools transformers
```

### Download Models

```bash
# Clone this repo (models are stored with Git LFS)
git lfs install
git clone https://huggingface.co/cstr/nllb-200-coreml-256
cd nllb-200-coreml-256
```

### Run Translation

```python
from example import translate_text

# English to German
result = translate_text(
    "Hello, how are you today?",
    source_lang="eng_Latn",
    target_lang="deu_Latn"
)
print(result)  # "Hallo, wie geht es dir heute?"
```

## 💡 Usage Examples

### Multiple Languages

```python
from example import translate_text

# English → Spanish
translate_text("Good morning!", "eng_Latn", "spa_Latn")
# → "¡Buenos días!"

# French → English
translate_text("Bonjour le monde", "fra_Latn", "eng_Latn")
# → "Hello world"

# Japanese → English
translate_text("こんにちは", "jpn_Jpan", "eng_Latn")
# → "Hello"
```

### Long Text Translation

```python
# The 256-token context handles longer paragraphs
long_text = """
Machine learning is a subset of artificial intelligence that enables
computers to learn and improve from experience without being explicitly
programmed.
In recent years, it has transformed technology and created new possibilities.
"""

result = translate_text(long_text, "eng_Latn", "deu_Latn")
```

### Production Usage

```python
import coremltools as ct
from transformers import AutoTokenizer


class Translator:
    def __init__(self):
        # Load once, reuse for all translations
        self.encoder = ct.models.MLModel(
            "NLLB_Encoder_256.mlpackage",
            compute_units=ct.ComputeUnit.ALL  # Use GPU/Neural Engine
        )
        self.decoder = ct.models.MLModel(
            "NLLB_Decoder_256.mlpackage",
            compute_units=ct.ComputeUnit.ALL
        )
        self.tokenizer = AutoTokenizer.from_pretrained("./tokenizer")

    def translate(self, text, src_lang, tgt_lang):
        # Your translation logic here
        pass


# Create once
translator = Translator()

# Reuse many times (fast!)
translator.translate("Hello", "eng_Latn", "deu_Latn")
translator.translate("Goodbye", "eng_Latn", "fra_Latn")
```

## 🌍 Supported Languages

See `language_codes.json` for the full list of 200+ languages. Common examples:

| Language | Code |
|----------|------|
| English | `eng_Latn` |
| German | `deu_Latn` |
| French | `fra_Latn` |
| Spanish | `spa_Latn` |
| Chinese (Simplified) | `zho_Hans` |
| Japanese | `jpn_Jpan` |
| Arabic | `arb_Arab` |
| Russian | `rus_Cyrl` |

Full list: [NLLB Language Codes](https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200)

## ⚙️ Technical Details

- **Max Tokens**: 256 (≈150-180 words, depending on language)
- **Precision**: FLOAT16
- **Compute**: CPU + GPU + Neural Engine
- **Base Model**: facebook/nllb-200-distilled-600M
- **Model Size**: ~3.2 GB total (encoder + decoder)

## 🔧 Advanced Options

### CPU-Only Mode

```python
encoder = ct.models.MLModel(
    "NLLB_Encoder_256.mlpackage",
    compute_units=ct.ComputeUnit.CPU_ONLY
)
```

### Batch Processing

```python
texts = ["Hello", "Goodbye", "Thank you"]
translations = [translate_text(t, "eng_Latn", "deu_Latn") for t in texts]
```

## 📊 Comparison with 128-Token Version

| Feature | 128-Token | 256-Token (This) |
|---------|-----------|------------------|
| Max Length | ~80-100 words | **~150-180 words** |
| Model Size | ~3.2 GB | ~3.2 GB |
| Speed | Faster | Slightly slower |
| Use Case | Short texts, chat | **Paragraphs, articles** |

## ⚠️ Limitations

- **256-token limit**: Longer text is truncated (~150-180 words)
- **Quality**: Distilled model; slightly lower quality than the full NLLB-3.3B
- **Low-resource languages**: May have reduced accuracy
- **No streaming**: Complete-sentence processing only

## 📝 License

- **Models**: CC-BY-NC-4.0 (inherited from NLLB-200)
- **Code**: MIT

⚠️ **Non-commercial use only** per the NLLB license.

## 🔗 Related Models

- [128-token version](https://huggingface.co/cstr/nllb-200-coreml-128) - Faster for short texts
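## ✂️ Working Around the Token Limit

Since anything past 256 tokens is truncated, longer documents need to be split before translation. Below is a minimal sketch of sentence-aligned chunking; the 150-word budget is a rough heuristic (exact limits depend on the tokenizer and language), and the `translate_text` call shown in the comment is the helper from `example.py` above.

```python
import re


def chunk_text(text, max_words=150):
    """Split text into sentence-aligned chunks, each staying under
    max_words, so every chunk fits the 256-token window.

    NOTE: the 150-word default is a heuristic, not an exact token
    count; for precise limits, count tokens with the tokenizer.
    """
    # Split after sentence-ending punctuation followed by whitespace
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        words = len(sentence.split())
        # Flush the current chunk if adding this sentence would exceed the budget
        if current and count + words > max_words:
            chunks.append(' '.join(current))
            current, count = [], 0
        current.append(sentence)
        count += words
    if current:
        chunks.append(' '.join(current))
    return chunks


# Translate each chunk independently, then rejoin:
# chunks = chunk_text(long_document)
# translated = ' '.join(
#     translate_text(c, "eng_Latn", "deu_Latn") for c in chunks
# )
```

Splitting on sentence boundaries (rather than a hard word cutoff) keeps each chunk grammatically complete, which generally produces better translations than cutting mid-sentence.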