---
license: cc-by-nc-4.0
library_name: transformers
pipeline_tag: translation
base_model: facebook/nllb-200-distilled-600M
tags:
- translation
- nllb
- seq2seq
- endpoints-template
inference: true
language:
- multilingual
---
# baseline-nllb
A baseline clone of [`facebook/nllb-200-distilled-600M`](https://huggingface.co/facebook/nllb-200-distilled-600M), packaged for **Hugging Face Inference Endpoints** with a custom handler so callers can pass arbitrary NLLB Flores-200 language codes at request time.
## Deploying to Inference Endpoints
1. Open this repo on the Hub and click **Deploy → Inference Endpoints**.
2. Pick a GPU instance (the 600M model runs fine on a small GPU; a CPU instance also works but is slower).
3. Leave the container type as **Default** — the Endpoints runtime will auto-detect [`handler.py`](./handler.py) and install [`requirements.txt`](./requirements.txt).
4. Deploy.
## Request format
```json
{
  "inputs": "Hello, world!",
  "parameters": {
    "src_lang": "eng_Latn",
    "tgt_lang": "spa_Latn",
    "max_length": 256,
    "num_beams": 4
  }
}
```
`inputs` may be a single string or a list of strings. `src_lang` / `tgt_lang` use the [Flores-200 codes](https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200) (e.g. `eng_Latn`, `spa_Latn`, `fra_Latn`, `zho_Hans`, `arb_Arab`). If omitted, the handler defaults to `eng_Latn` → `spa_Latn`.
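The accept-string-or-list behavior amounts to a small normalization step before tokenization. A minimal sketch of that logic (a hypothetical helper for illustration; the actual implementation lives in [`handler.py`](./handler.py)):

```python
DEFAULT_SRC = "eng_Latn"
DEFAULT_TGT = "spa_Latn"

def normalize_payload(payload: dict) -> tuple[list[str], str, str]:
    """Coerce `inputs` to a list of strings and fill in default language codes."""
    inputs = payload.get("inputs", [])
    texts = [inputs] if isinstance(inputs, str) else list(inputs)
    params = payload.get("parameters") or {}
    return (
        texts,
        params.get("src_lang", DEFAULT_SRC),
        params.get("tgt_lang", DEFAULT_TGT),
    )
```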
### Response
```json
[{ "translation_text": "¡Hola, mundo!" }]
```
## Example clients
### cURL
```bash
curl https://<your-endpoint>.endpoints.huggingface.cloud \
  -H "Authorization: Bearer $HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": "Hello, world!",
    "parameters": { "src_lang": "eng_Latn", "tgt_lang": "fra_Latn" }
  }'
```
### Python
```python
import os

import requests

HF_TOKEN = os.environ["HF_TOKEN"]  # your Hugging Face access token

resp = requests.post(
    "https://<your-endpoint>.endpoints.huggingface.cloud",
    headers={"Authorization": f"Bearer {HF_TOKEN}"},
    json={
        "inputs": ["Hello, world!", "How are you?"],
        "parameters": {"src_lang": "eng_Latn", "tgt_lang": "deu_Latn"},
    },
    timeout=30,
)
print(resp.json())
```
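For batched requests the endpoint returns one result object per input string, in the same order, so pairing inputs with translations is a plain `zip`. A short sketch (the response body below is an illustrative example, not real model output):

```python
inputs = ["Hello, world!", "How are you?"]

# Example decoded response body (what resp.json() would return), one dict per input:
results = [
    {"translation_text": "Hallo, Welt!"},
    {"translation_text": "Wie geht es dir?"},
]

# Pair each source string with its translation, relying on order preservation.
pairs = dict(zip(inputs, (r["translation_text"] for r in results)))
```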
## Files in this repo
| File | Purpose |
| --- | --- |
| `handler.py` | Custom `EndpointHandler` used by HF Inference Endpoints. |
| `requirements.txt` | Extra Python deps installed into the endpoint container. |
| `model_loader.py` | One-off script that pushed the base NLLB weights to this repo. |
| `config.json`, `tokenizer*`, `*.safetensors` | Model + tokenizer artifacts (pushed by `model_loader.py`). |
| `TROUBLESHOOTING.md` | Real deploy failures we hit and how we fixed them — read this first if the endpoint won't start. |
## License
Inherits `CC-BY-NC-4.0` from the upstream `facebook/nllb-200-distilled-600M` model — **non-commercial use only**.