---
license: cc-by-nc-4.0
library_name: transformers
pipeline_tag: translation
base_model: facebook/nllb-200-distilled-600M
tags:
- translation
- nllb
- seq2seq
- endpoints-template
inference: true
language:
- multilingual
---
# baseline-nllb

A baseline clone of [`facebook/nllb-200-distilled-600M`](https://huggingface.co/facebook/nllb-200-distilled-600M), packaged for **Hugging Face Inference Endpoints** with a custom handler so callers can pass arbitrary NLLB Flores-200 language codes at request time.
## Deploying to Inference Endpoints

1. Open this repo on the Hub and click **Deploy → Inference Endpoints**.
2. Pick a GPU instance (the 600M model runs fine on a small GPU; a CPU instance also works but is slower).
3. Leave the container type as **Default**. The Endpoints runtime will auto-detect [`handler.py`](./handler.py) and install [`requirements.txt`](./requirements.txt).
4. Deploy.
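
The custom handler that step 3 relies on is not reproduced in this README. As a rough illustration only, a handler of this kind might look like the sketch below. It assumes the usual Inference Endpoints convention of an `EndpointHandler` class with `__init__(path)` and `__call__(data)`, and it uses the standard NLLB pattern of setting `tokenizer.src_lang` and forcing the target-language token as the first generated token. The repo's actual `handler.py` may differ, and device placement is left out for brevity.

```python
from typing import Any, Dict, List


class EndpointHandler:
    """Hypothetical handler sketch; not the repo's actual handler.py."""

    def __init__(self, path: str = "") -> None:
        # Imported lazily so the class can be defined without transformers installed.
        from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

        self.tokenizer = AutoTokenizer.from_pretrained(path)
        self.model = AutoModelForSeq2SeqLM.from_pretrained(path)
        self.model.eval()

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, str]]:
        inputs = data.get("inputs", "")
        # Accept either a single string or a list of strings.
        texts = [inputs] if isinstance(inputs, str) else list(inputs)
        params = dict(data.get("parameters") or {})

        # Defaults documented in this README: eng_Latn -> spa_Latn.
        src_lang = params.pop("src_lang", "eng_Latn")
        tgt_lang = params.pop("tgt_lang", "spa_Latn")

        # The NLLB tokenizer adds the source-language token based on src_lang.
        self.tokenizer.src_lang = src_lang
        encoded = self.tokenizer(
            texts, return_tensors="pt", padding=True, truncation=True
        )

        # Forcing the first generated token to the target code selects the output language.
        generated = self.model.generate(
            **encoded,
            forced_bos_token_id=self.tokenizer.convert_tokens_to_ids(tgt_lang),
            **params,  # remaining generation options, e.g. max_length, num_beams
        )
        decoded = self.tokenizer.batch_decode(generated, skip_special_tokens=True)
        return [{"translation_text": text} for text in decoded]
```

Passing the remaining `parameters` straight through to `generate` keeps the sketch short, but a real handler would probably whitelist the options it accepts.
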
## Request format

```json
{
  "inputs": "Hello, world!",
  "parameters": {
    "src_lang": "eng_Latn",
    "tgt_lang": "spa_Latn",
    "max_length": 256,
    "num_beams": 4
  }
}
```

`inputs` may be a single string or a list of strings. `src_lang` / `tgt_lang` use the [Flores-200 codes](https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200) (e.g. `eng_Latn`, `spa_Latn`, `fra_Latn`, `zho_Hans`, `arb_Arab`). If omitted, the handler defaults to `eng_Latn` → `spa_Latn`.
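
A client may want to catch a mistyped language code before sending a request. The sketch below builds the request body described above and checks only the *shape* of each code (three lowercase letters, an underscore, a four-letter script tag, as in the examples listed). `build_payload` is a hypothetical helper name, and the check does not confirm that the model actually supports a given code.

```python
import re
from typing import Any, Dict, List, Union

# Shape shared by the codes listed above, e.g. eng_Latn, zho_Hans, arb_Arab.
_CODE_SHAPE = re.compile(r"[a-z]{3}_[A-Z][a-z]{3}")


def build_payload(
    inputs: Union[str, List[str]],
    src_lang: str = "eng_Latn",
    tgt_lang: str = "spa_Latn",
    **generation: Any,
) -> Dict[str, Any]:
    """Build a request body (hypothetical helper, not part of this repo)."""
    for code in (src_lang, tgt_lang):
        if not _CODE_SHAPE.fullmatch(code):
            raise ValueError(f"not shaped like a Flores-200 code: {code!r}")
    # Extra keyword arguments (e.g. max_length, num_beams) ride along in "parameters".
    parameters = {"src_lang": src_lang, "tgt_lang": tgt_lang, **generation}
    return {"inputs": inputs, "parameters": parameters}
```

For example, `build_payload("Hello, world!", tgt_lang="fra_Latn", num_beams=4)` yields a body in the format shown above, while `build_payload("Hi", src_lang="en")` raises `ValueError`.
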
### Response

```json
[{ "translation_text": "¡Hola, mundo!" }]
```
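
When several strings are sent at once, it helps to line each translation up with its source text. The sketch below does that under an assumption this README does not state explicitly: that the response list contains one `translation_text` entry per input, in the same order. `pair_translations` is a hypothetical helper name.

```python
from typing import Dict, List, Tuple, Union


def pair_translations(
    inputs: Union[str, List[str]],
    response: List[Dict[str, str]],
) -> List[Tuple[str, str]]:
    """Pair each source text with its translation (assumes same order and length)."""
    texts = [inputs] if isinstance(inputs, str) else list(inputs)
    if len(texts) != len(response):
        raise ValueError(
            f"expected {len(texts)} translations, got {len(response)}"
        )
    return [(src, item["translation_text"]) for src, item in zip(texts, response)]
```

The length check fails loudly if the assumption does not hold, rather than silently dropping entries as `zip` alone would.
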
## Example clients

### cURL

```bash
curl https://<your-endpoint>.endpoints.huggingface.cloud \
  -H "Authorization: Bearer $HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": "Hello, world!",
    "parameters": { "src_lang": "eng_Latn", "tgt_lang": "fra_Latn" }
  }'
```
### Python

```python
import os

import requests

# Read the access token from the environment, as the cURL example does.
HF_TOKEN = os.environ["HF_TOKEN"]

resp = requests.post(
    "https://<your-endpoint>.endpoints.huggingface.cloud",
    headers={"Authorization": f"Bearer {HF_TOKEN}"},
    json={
        "inputs": ["Hello, world!", "How are you?"],
        "parameters": {"src_lang": "eng_Latn", "tgt_lang": "deu_Latn"},
    },
    timeout=30,
)
resp.raise_for_status()  # surface HTTP errors instead of printing an error body
print(resp.json())
```
## Files in this repo

| File | Purpose |
| --- | --- |
| `handler.py` | Custom `EndpointHandler` used by HF Inference Endpoints. |
| `requirements.txt` | Extra Python deps installed into the endpoint container. |
| `model_loader.py` | One-off script that pushed the base NLLB weights to this repo. |
| `config.json`, `tokenizer*`, `*.safetensors` | Model + tokenizer artifacts (pushed by `model_loader.py`). |
| `TROUBLESHOOTING.md` | Real deploy failures we hit and how we fixed them. Read this first if the endpoint won't start. |
## License

Inherits `CC-BY-NC-4.0` from the upstream `facebook/nllb-200-distilled-600M` model: **non-commercial use only**.