---
license: cc-by-nc-4.0
tags:
- coreml
- translation
- nllb
- multilingual
- on-device
- iOS
- macOS
library_name: coremltools
base_model: facebook/nllb-200-distilled-600M
---
# NLLB-200 CoreML (128 tokens)
On-device neural machine translation for **200 languages** using CoreML on Apple devices (iPhone, iPad, Mac).
This is a CoreML conversion of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) optimized for:
- βœ… Fast on-device inference
- βœ… GPU/Neural Engine acceleration
- βœ… 128-token context (β‰ˆ80-100 words)
## πŸ“¦ What's Included
```
.
β”œβ”€β”€ NLLB_Encoder_128.mlpackage # Encoder model (~1.5 GB)
β”œβ”€β”€ NLLB_Decoder_128.mlpackage # Decoder model (~1.7 GB)
β”œβ”€β”€ tokenizer/ # Tokenizer files
β”œβ”€β”€ example.py # Ready-to-run example
└── language_codes.json # Language code reference
```
## πŸš€ Quick Start
### Installation
```bash
pip install coremltools transformers
```
### Download Models
```bash
# Clone this repo
git lfs install
git clone https://huggingface.co/cstr/nllb-200-coreml-128
cd nllb-200-coreml-128
```
### Run Translation
```python
from example import translate_text
# English to German
result = translate_text(
    "Hello, how are you today?",
    source_lang="eng_Latn",
    target_lang="deu_Latn"
)
print(result) # "Hallo, wie geht es dir heute?"
```
## πŸ’‘ Usage Examples
### Multiple Languages
```python
from example import translate_text
# English β†’ Spanish
translate_text("Good morning!", "eng_Latn", "spa_Latn")
# β†’ "Β‘Buenos dΓ­as!"
# French β†’ English
translate_text("Bonjour le monde", "fra_Latn", "eng_Latn")
# β†’ "Hello world"
# Japanese β†’ English
translate_text("こんにけは", "jpn_Jpan", "eng_Latn")
# β†’ "Hello"
```
### Production Usage
```python
import coremltools as ct
from transformers import AutoTokenizer
class Translator:
    def __init__(self):
        # Load once, reuse for all translations
        self.encoder = ct.models.MLModel(
            "NLLB_Encoder_128.mlpackage",
            compute_units=ct.ComputeUnit.ALL  # CPU, GPU, and Neural Engine
        )
        self.decoder = ct.models.MLModel(
            "NLLB_Decoder_128.mlpackage",
            compute_units=ct.ComputeUnit.ALL
        )
        self.tokenizer = AutoTokenizer.from_pretrained("./tokenizer")

    def translate(self, text, src_lang, tgt_lang):
        # Your translation logic here
        pass

# Create once
translator = Translator()

# Reuse many times (fast!)
translator.translate("Hello", "eng_Latn", "deu_Latn")
translator.translate("Goodbye", "eng_Latn", "fra_Latn")
```
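The `translate` stub above boils down to two steps: run the encoder once on the source tokens, then call the decoder one token at a time, feeding back each predicted token until end-of-sequence. The loop itself is model-agnostic, so it can be sketched without the CoreML specifics. This is an illustrative sketch, not code from `example.py`; `greedy_decode` and `step_fn` are hypothetical names, and the actual CoreML input/output feature names should be checked against the `.mlpackage` spec (`model.get_spec()`).

```python
def greedy_decode(step_fn, bos_id, eos_id, max_len=128):
    """Generate token ids one at a time (greedy search).

    step_fn(tokens) must return the next-token logits (a sequence of
    vocabulary scores) given the tokens generated so far. In practice,
    step_fn would wrap decoder.predict(...) on the cached encoder output.
    """
    tokens = [bos_id]
    for _ in range(max_len - 1):
        logits = step_fn(tokens)
        # Pick the highest-scoring token (greedy choice, no sampling)
        next_id = max(range(len(logits)), key=lambda i: logits[i])
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens
```

For NLLB, the decoder is typically primed with the *target* language token (e.g. `deu_Latn`) so the model knows which language to generate; the returned ids are then decoded back to text with `tokenizer.decode(...)`.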
## 🌍 Supported Languages
See `language_codes.json` for the full list of 200+ languages. Common examples:
| Language | Code |
|----------|------|
| English | `eng_Latn` |
| German | `deu_Latn` |
| French | `fra_Latn` |
| Spanish | `spa_Latn` |
| Chinese (Simplified) | `zho_Hans` |
| Japanese | `jpn_Jpan` |
| Arabic | `arb_Arab` |
| Russian | `rus_Cyrl` |
Full list: [NLLB Language Codes](https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200)
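NLLB codes combine an ISO 639-3 language tag with an ISO 15924 script tag (`eng_Latn`, `zho_Hans`). If you need to look up or validate codes programmatically, a small helper can do it. This is an illustrative sketch, not part of `example.py`; `find_code` assumes `language_codes.json` maps codes to language names (e.g. `{"eng_Latn": "English", ...}`), so adjust the lookup if your copy is structured differently.

```python
import re

# NLLB codes: 3-letter lowercase language tag + 4-letter script tag
NLLB_CODE = re.compile(r"^[a-z]{3}_[A-Z][a-z]{3}$")  # e.g. "eng_Latn"

def find_code(name, codes):
    """Return the NLLB code for a language name (case-insensitive).

    codes: dict mapping NLLB code -> language name, e.g. as loaded
    via json.load(open("language_codes.json")) -- structure assumed.
    """
    for code, lang in codes.items():
        if lang.lower() == name.lower():
            return code
    raise KeyError(f"unknown language: {name!r}")
```

Under the assumed JSON structure, `find_code("German", codes)` would return `"deu_Latn"`.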
## βš™οΈ Technical Details
- **Max Tokens**: 128 (β‰ˆ80-100 words depending on language)
- **Precision**: FLOAT16
- **Compute**: CPU + GPU + Neural Engine
- **Base Model**: facebook/nllb-200-distilled-600M
## πŸ”§ Advanced Options
### CPU-Only Mode
```python
encoder = ct.models.MLModel(
    "NLLB_Encoder_128.mlpackage",
    compute_units=ct.ComputeUnit.CPU_ONLY
)
```
### Batch Processing
```python
texts = ["Hello", "Goodbye", "Thank you"]
translations = [translate_text(t, "eng_Latn", "deu_Latn") for t in texts]
```
## ⚠️ Limitations
- **128 token limit**: Longer text is truncated (~80-100 words)
- **Quality**: Distilled model, slightly lower quality than full NLLB-3.3B
- **Low-resource languages**: May have reduced accuracy
- **No streaming**: Complete sentence processing only
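One practical workaround for the 128-token limit is to split long input into sentence-aligned chunks and translate each chunk separately. The sketch below (a hypothetical helper, not part of `example.py`) uses word count as a rough proxy for token count; subword tokenization can produce noticeably more tokens than words, so keep a generous margin below 128.

```python
import re

def split_for_translation(text, max_words=80):
    """Split text into sentence-aligned chunks of at most max_words words.

    Word count only approximates the 128-token limit: subword tokenizers
    may emit several tokens per word, especially for rare words.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sent in sentences:
        words = len(sent.split())
        # Start a new chunk when adding this sentence would exceed the budget
        if current and count + words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sent)
        count += words
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk can then be passed through `translate_text` and the results joined; sentence-level splitting keeps each chunk self-contained, which matters because the model sees no context across chunks.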
## πŸ“ License
- **Models**: CC-BY-NC-4.0 (inherited from NLLB-200)
- **Code**: MIT
⚠️ **Non-commercial use only** per NLLB license