---
license: cc-by-nc-4.0
library_name: transformers
pipeline_tag: translation
base_model: facebook/nllb-200-distilled-600M
tags:
- translation
- nllb
- seq2seq
- endpoints-template
inference: true
language:
- multilingual
---

# baseline-nllb

A baseline clone of [`facebook/nllb-200-distilled-600M`](https://huggingface.co/facebook/nllb-200-distilled-600M), packaged for **Hugging Face Inference Endpoints** with a custom handler so callers can pass arbitrary NLLB Flores-200 language codes at request time.

## Deploying to Inference Endpoints

1. Open this repo on the Hub and click **Deploy → Inference Endpoints**.
2. Pick a GPU instance (the 600M model runs fine on a small GPU; a CPU instance also works, just more slowly).
3. Leave the container type as **Default** — the Endpoints runtime will auto-detect [`handler.py`](./handler.py) and install [`requirements.txt`](./requirements.txt).
4. Deploy.

## Request format

```json
{
  "inputs": "Hello, world!",
  "parameters": {
    "src_lang": "eng_Latn",
    "tgt_lang": "spa_Latn",
    "max_length": 256,
    "num_beams": 4
  }
}
```

`inputs` may be a single string or a list of strings. `src_lang` / `tgt_lang` use the [Flores-200 codes](https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200) (e.g. `eng_Latn`, `spa_Latn`, `fra_Latn`, `zho_Hans`, `arb_Arab`). If omitted, the handler defaults to `eng_Latn` → `spa_Latn`.

### Response

```json
[{
  "translation_text": "¡Hola, mundo!"
}]
```

## Example clients

### cURL

```bash
curl https://.endpoints.huggingface.cloud \
  -H "Authorization: Bearer $HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": "Hello, world!",
    "parameters": { "src_lang": "eng_Latn", "tgt_lang": "fra_Latn" }
  }'
```

### Python

```python
import requests

resp = requests.post(
    "https://.endpoints.huggingface.cloud",
    headers={"Authorization": f"Bearer {HF_TOKEN}"},
    json={
        "inputs": ["Hello, world!", "How are you?"],
        "parameters": {"src_lang": "eng_Latn", "tgt_lang": "deu_Latn"},
    },
    timeout=30,
)
print(resp.json())
```

## Files in this repo

| File | Purpose |
| --- | --- |
| `handler.py` | Custom `EndpointHandler` used by HF Inference Endpoints. |
| `requirements.txt` | Extra Python deps installed into the endpoint container. |
| `model_loader.py` | One-off script that pushed the base NLLB weights to this repo. |
| `config.json`, `tokenizer*`, `*.safetensors` | Model + tokenizer artifacts (pushed by `model_loader.py`). |
| `TROUBLESHOOTING.md` | Real deploy failures we hit and how we fixed them — read this first if the endpoint won't start. |

## License

Inherits `CC-BY-NC-4.0` from the upstream `facebook/nllb-200-distilled-600M` model — **non-commercial use only**.
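## Handler sketch

For reference, a custom handler for this request/response shape could look roughly like the sketch below. This is an illustrative outline, not a copy of the `handler.py` in this repo: the class name and `__call__` contract are what Inference Endpoints expects, but the exact defaults and options here are assumptions mirroring the request-format section above.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer


class EndpointHandler:
    """Minimal NLLB translation handler sketch for HF Inference Endpoints."""

    def __init__(self, path: str = ""):
        # The Endpoints runtime passes the local model directory as `path`.
        self.tokenizer = AutoTokenizer.from_pretrained(path)
        self.model = AutoModelForSeq2SeqLM.from_pretrained(path)

    def __call__(self, data: dict) -> list[dict]:
        inputs = data["inputs"]
        params = data.get("parameters") or {}

        # Accept a single string or a list of strings.
        texts = [inputs] if isinstance(inputs, str) else list(inputs)

        # Defaults matching the request-format section: eng_Latn -> spa_Latn.
        self.tokenizer.src_lang = params.get("src_lang", "eng_Latn")
        tgt_lang = params.get("tgt_lang", "spa_Latn")

        batch = self.tokenizer(
            texts, return_tensors="pt", padding=True, truncation=True
        )
        # NLLB steers the target language by forcing its language token
        # as the first generated token.
        generated = self.model.generate(
            **batch,
            forced_bos_token_id=self.tokenizer.convert_tokens_to_ids(tgt_lang),
            max_length=params.get("max_length", 256),
            num_beams=params.get("num_beams", 4),
        )
        decoded = self.tokenizer.batch_decode(generated, skip_special_tokens=True)
        return [{"translation_text": text} for text in decoded]
```

The real `handler.py` may add error handling and device placement (e.g. moving the model and batch to CUDA when available); see that file for what actually runs in the endpoint.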