---
library_name: transformers
license: cc-by-nc-4.0
tags:
- translation
- nllb
---
# My NLLB-200 Translator

This repository contains a copy of Meta's (Facebook) **NLLB-200-distilled-600M** model, mirrored here for personal use and application deployment.

### 🌟 Model Details

- **Original Developer:** Meta AI (Facebook)
- **Model Type:** Seq2Seq Language Model (Machine Translation)
- **Model Size:** 600 million parameters
- **License:** CC-BY-NC-4.0 (Non-commercial use only)

### 🌍 Language Support

This model supports direct translation between 200+ languages. Each language is selected by its FLORES-200 code, for example:

- English: `eng_Latn`
- Telugu: `tel_Telu`
- Hindi: `hin_Deva`
- French: `fra_Latn`
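
Since the codes are easy to mistype, a small lookup table can help in application code. The following is a minimal sketch, not part of the model or of `transformers`: the names `LANG_CODES` and `lang_code` are hypothetical, and the table only contains the four codes listed above.

```python
# Hypothetical helper: maps plain language names to the codes listed above.
# Extend the table with further FLORES-200 codes as needed.
LANG_CODES = {
    "english": "eng_Latn",
    "telugu": "tel_Telu",
    "hindi": "hin_Deva",
    "french": "fra_Latn",
}


def lang_code(name: str) -> str:
    """Return the language code for a language name (case-insensitive)."""
    key = name.strip().lower()
    if key not in LANG_CODES:
        raise KeyError(f"No code registered for language: {name!r}")
    return LANG_CODES[key]


print(lang_code("Telugu"))  # tel_Telu
```

Raising on an unknown name, rather than falling back to a default, keeps a typo from silently producing output in the wrong language.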

### 🚀 How to Get Started

You can use this model directly with the Hugging Face `transformers` library:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Repository ID of this model on the Hugging Face Hub
model_name = "SatLlama/AI_Translator"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Set the source language
tokenizer.src_lang = "eng_Latn"

text = "Hello, how are you today?"
inputs = tokenizer(text, return_tensors="pt")

# Generate the translation, forcing the target language token (example: Telugu)
translated_tokens = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("tel_Telu"),
    max_length=50,
)

output = tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0]
print("Translation:", output)
```
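
For repeated use, the same steps can be wrapped in a function. This is a minimal sketch under stated assumptions: `translate` is a hypothetical helper, not an API of `transformers`. It expects the `tokenizer` and `model` objects loaded as shown above, and it adds `padding=True` (an addition to the original snippet) so that several sentences of different lengths can be translated in one batch.

```python
from typing import List


def translate(
    texts: List[str],
    tokenizer,
    model,
    src_lang: str,
    tgt_lang: str,
    max_length: int = 50,
) -> List[str]:
    """Translate a batch of texts (hypothetical helper, not a library API)."""
    # The tokenizer uses src_lang to add the source language token to the input.
    tokenizer.src_lang = src_lang
    # padding=True lets sentences of different lengths share one batch.
    inputs = tokenizer(texts, return_tensors="pt", padding=True)
    # Force the decoder to start with the target language token.
    target_id = tokenizer.convert_tokens_to_ids(tgt_lang)
    generated = model.generate(
        **inputs,
        forced_bos_token_id=target_id,
        max_length=max_length,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


# Usage, assuming `tokenizer` and `model` were loaded as in the snippet above:
# translate(["Hello!", "How are you today?"], tokenizer, model, "eng_Latn", "hin_Deva")
```

Note that `max_length=50` caps the length of each generated translation, so longer inputs may need a larger value.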

## Citation

    @article{nllbteam2022neglected,
      title={No Language Left Behind: Scaling Human-Centered Machine Translation},
      author={NLLB Team and Marta R. Costa-jussà and James Cross and Onur Çelebi and Maha Elbayad and Kenneth Heafield and others},
      journal={arXiv preprint arXiv:2207.04672},
      year={2022}
    }