---
library_name: transformers
license: cc-by-nc-4.0
tags:
- translation
- nllb
---

# My NLLB-200 Translator

This repository contains a copy of Meta AI's **NLLB-200-distilled-600M** model, cloned here for personal access and application deployment.

### 🌟 Model Details

- **Original Developer:** Meta AI (Facebook)
- **Model Type:** Seq2Seq language model (machine translation)
- **Model Size:** About 600 million parameters
- **License:** CC-BY-NC-4.0 (non-commercial use only)

### 🌍 Language Support

This model supports direct translation between 200+ languages. Each language is identified by a code that combines a three-letter language tag with a four-letter script tag, for example:

- English: `eng_Latn`
- Telugu: `tel_Telu`
- Hindi: `hin_Deva`
- French: `fra_Latn`

### 🚀 How to Get Started

You can use this model directly with the Hugging Face `transformers` library:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Replace with your actual repository path
model_name = "YOUR_USERNAME/YOUR_REPO_NAME"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Set the source language so the tokenizer prepends the right language token
tokenizer.src_lang = "eng_Latn"
text = "Hello, how are you today?"
inputs = tokenizer(text, return_tensors="pt")

# Force the decoder to start with the target language token (example: Telugu)
translated_tokens = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("tel_Telu"),
    max_length=50,  # raise this for longer inputs
)

# Decode the generated token IDs back into text
output = tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0]
print("Translation:", output)
```

## Citation

```bibtex
@article{nllbteam2022neglected,
  title={No Language Left Behind: Scaling Human-Centered Machine Translation},
  author={NLLB Team and Marta R. Costa-jussà and James Cross and Onur Çelebi and Maha Elbayad and Kenneth Heafield and others},
  journal={arXiv preprint arXiv:2207.04672},
  year={2022}
}
```
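
### 🧰 Optional: Batch Translation Helper

The getting-started steps (set the source language, tokenize, force the target language token, decode) can be wrapped in a small reusable function. The following is a minimal sketch, not part of the original model or of `transformers`: `LANG_CODES`, `resolve_lang`, and `translate_batch` are hypothetical names introduced here for illustration, and the name table only covers the four languages listed in this README. The tokenizer and model are passed in as arguments and are assumed to be the objects loaded with `AutoTokenizer` and `AutoModelForSeq2SeqLM` as shown earlier.

```python
import re

# Hypothetical lookup table: only the codes listed in this README.
# The model's vocabulary contains many more language codes.
LANG_CODES = {
    "english": "eng_Latn",
    "telugu": "tel_Telu",
    "hindi": "hin_Deva",
    "french": "fra_Latn",
}

# Code shape: three lowercase letters, underscore, four-letter script tag.
# This checks the format only, not whether the model knows the code.
_CODE_PATTERN = re.compile(r"^[a-z]{3}_[A-Z][a-z]{3}$")


def resolve_lang(name_or_code):
    """Map a known language name to its code, or pass through a well-formed code."""
    key = name_or_code.strip()
    if key.lower() in LANG_CODES:
        return LANG_CODES[key.lower()]
    if _CODE_PATTERN.match(key):
        return key
    raise ValueError(f"Unrecognized language name or code: {name_or_code!r}")


def translate_batch(texts, tokenizer, model, src, tgt, max_length=50):
    """Translate a list of strings from `src` to `tgt` (hypothetical helper)."""
    # Tell the tokenizer which source language token to prepend.
    tokenizer.src_lang = resolve_lang(src)
    # Pad so that sentences of different lengths fit in one tensor.
    inputs = tokenizer(texts, return_tensors="pt", padding=True)
    # Force the decoder to begin with the target language token.
    tokens = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(resolve_lang(tgt)),
        max_length=max_length,
    )
    return tokenizer.batch_decode(tokens, skip_special_tokens=True)
```

With the `tokenizer` and `model` loaded as in the getting-started snippet, a call such as `translate_batch(["Hello", "Good morning"], tokenizer, model, "english", "telugu")` would return one translated string per input. Passing the tokenizer and model in as arguments keeps the helper independent of how or where they were loaded.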