---
license: cc-by-nc-4.0
library_name: transformers
pipeline_tag: translation
base_model: facebook/nllb-200-distilled-600M
tags:
  - translation
  - nllb
  - seq2seq
  - endpoints-template
inference: true
language:
  - multilingual
---

# baseline-nllb

A baseline clone of facebook/nllb-200-distilled-600M, packaged for Hugging Face Inference Endpoints with a custom handler so callers can pass arbitrary NLLB Flores-200 language codes at request time.
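
The interesting part is `handler.py`. Below is a minimal sketch of what such a handler can look like, assuming the standard Endpoints contract (`__init__(path)` / `__call__(data)`); the `handler.py` shipped in this repo is the authoritative version:

```python
# Hypothetical sketch only; see handler.py in this repo for the real code.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer


class EndpointHandler:
    def __init__(self, path: str = ""):
        # `path` points at the repo contents inside the endpoint container
        self.tokenizer = AutoTokenizer.from_pretrained(path)
        self.model = AutoModelForSeq2SeqLM.from_pretrained(path)

    def __call__(self, data: dict) -> list[dict]:
        params = data.get("parameters") or {}
        texts = data["inputs"]
        if isinstance(texts, str):
            texts = [texts]
        # NLLB takes the source language via the tokenizer...
        self.tokenizer.src_lang = params.get("src_lang", "eng_Latn")
        batch = self.tokenizer(texts, return_tensors="pt", padding=True)
        # ...and the target language via the forced BOS token
        tgt = params.get("tgt_lang", "spa_Latn")
        generated = self.model.generate(
            **batch,
            forced_bos_token_id=self.tokenizer.convert_tokens_to_ids(tgt),
            max_length=params.get("max_length", 256),
            num_beams=params.get("num_beams", 4),
        )
        decoded = self.tokenizer.batch_decode(generated, skip_special_tokens=True)
        return [{"translation_text": t} for t in decoded]
```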

## Deploying to Inference Endpoints

1. Open this repo on the Hub and click **Deploy → Inference Endpoints**.
2. Pick a GPU instance (the 600M model runs fine on a small GPU; a CPU instance also works, just more slowly).
3. Leave the container type as **Default**; the Endpoints runtime auto-detects `handler.py` and installs `requirements.txt`.
4. Deploy.
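
If you would rather script the deploy, `huggingface_hub` exposes the same flow via `create_inference_endpoint`. A sketch with illustrative values; adjust vendor, region, and instance to whatever the UI offers for your account (`your-org/baseline-nllb` is a placeholder):

```python
from huggingface_hub import create_inference_endpoint

endpoint = create_inference_endpoint(
    "baseline-nllb",
    repository="your-org/baseline-nllb",  # placeholder: this repo's id
    framework="pytorch",
    task="translation",
    accelerator="gpu",
    vendor="aws",               # illustrative; any supported cloud works
    region="us-east-1",         # illustrative
    instance_size="x1",         # illustrative
    instance_type="nvidia-t4",  # illustrative small GPU
    type="protected",
)
endpoint.wait()  # block until the endpoint reports "running"
print(endpoint.url)
```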

## Request format

```json
{
  "inputs": "Hello, world!",
  "parameters": {
    "src_lang": "eng_Latn",
    "tgt_lang": "spa_Latn",
    "max_length": 256,
    "num_beams": 4
  }
}
```

`inputs` may be a single string or a list of strings. `src_lang` / `tgt_lang` use the Flores-200 codes (e.g. `eng_Latn`, `spa_Latn`, `fra_Latn`, `zho_Hans`, `arb_Arab`). If omitted, the handler defaults to `eng_Latn` → `spa_Latn`.
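
To discover the full set of accepted codes, you can inspect the tokenizer locally. A quick check, assuming a recent `transformers` where the Flores-200 codes are registered as additional special tokens (older versions exposed a `lang_code_to_id` mapping instead):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")
# Flores-200 codes look like "xxx_Script", e.g. "eng_Latn"
codes = [t for t in tok.additional_special_tokens if "_" in t]
print(len(codes))  # ~200 languages
print(codes[:3])   # e.g. ['ace_Arab', 'ace_Latn', 'acm_Arab']
```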

## Response

```json
[{ "translation_text": "¡Hola, mundo!" }]
```

## Example clients

### cURL

```bash
curl https://<your-endpoint>.endpoints.huggingface.cloud \
  -H "Authorization: Bearer $HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
        "inputs": "Hello, world!",
        "parameters": { "src_lang": "eng_Latn", "tgt_lang": "fra_Latn" }
      }'
```

### Python

```python
import os

import requests

HF_TOKEN = os.environ["HF_TOKEN"]  # your Hugging Face access token

resp = requests.post(
    "https://<your-endpoint>.endpoints.huggingface.cloud",
    headers={"Authorization": f"Bearer {HF_TOKEN}"},
    json={
        "inputs": ["Hello, world!", "How are you?"],
        "parameters": {"src_lang": "eng_Latn", "tgt_lang": "deu_Latn"},
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```
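
Alternatively, `huggingface_hub`'s `InferenceClient` can point straight at a dedicated endpoint URL. A sketch, assuming the client's `translation` helper (which accepts `src_lang` / `tgt_lang` in recent versions) lines up with the response shape above:

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient(
    model="https://<your-endpoint>.endpoints.huggingface.cloud",
    token=os.environ["HF_TOKEN"],
)
out = client.translation("Hello, world!", src_lang="eng_Latn", tgt_lang="fra_Latn")
print(out.translation_text)
```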

## Files in this repo

| File | Purpose |
| --- | --- |
| `handler.py` | Custom `EndpointHandler` used by HF Inference Endpoints. |
| `requirements.txt` | Extra Python deps installed into the endpoint container. |
| `model_loader.py` | One-off script that pushed the base NLLB weights to this repo. |
| `config.json`, `tokenizer*`, `*.safetensors` | Model + tokenizer artifacts (pushed by `model_loader.py`). |
| `TROUBLESHOOTING.md` | Real deploy failures we hit and how we fixed them; read this first if the endpoint won't start. |
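
For the curious, `model_loader.py` amounts to a pull-then-push. A hypothetical reconstruction (the script in this repo is authoritative; `your-org/baseline-nllb` is a placeholder repo id):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

SRC = "facebook/nllb-200-distilled-600M"
DST = "your-org/baseline-nllb"  # placeholder: replace with this repo's id

model = AutoModelForSeq2SeqLM.from_pretrained(SRC)
tokenizer = AutoTokenizer.from_pretrained(SRC)
model.push_to_hub(DST)      # uploads config.json + *.safetensors
tokenizer.push_to_hub(DST)  # uploads tokenizer files
```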

## License

Inherits **CC-BY-NC-4.0** from the upstream [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) model: non-commercial use only.