English-Nepali Translation Model
Overview
A translation model fine-tuned from Facebook's NLLB-200 distilled model for English-to-Nepali and Nepali-to-English translation.
Model Details
| Property | Value |
|---|---|
| Model ID | Saugat212/ne-en-nllb-model |
| Base Model | facebook/nllb-200-distilled-600M |
| Architecture | m2m_100 |
| Parameters | 0.6B |
| License | apache-2.0 |
Purpose
- Translate English text to Nepali (EN→NE)
- Translate Nepali text to English (NE→EN)
- Domain-specific translation using custom fine-tuned weights
Contents
| File | Description |
|---|---|
| Fine_Tuning.ipynb | NLLB fine-tuning notebook |
| Fine_Tuning_nllb.ipynb | NLLB-specific fine-tuning |
| transformer_finetuning.ipynb | Alternative transformer fine-tuning |
| data_clean.ipynb | Data cleaning notebook |
| Data Fetching from translator.ipynb | Fetching parallel data |
| inference.ipynb | Translation inference notebook |
| opus-translation.py | OPUS-based translation |
| Finetune.md | Quick setup guide |
| NLLB_Finetuning_Documentation.md | Detailed NLLB docs |
Usage
Load Model
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
model_name = "Saugat212/ne-en-nllb-model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
Translate EN→NE
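def translate_en_to_ne(text):
    # NLLB tokenizers use FLORES-200 codes: eng_Latn (English), npi_Deva (Nepali)
    tokenizer.src_lang = "eng_Latn"
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    # Force Nepali as the first generated token; convert_tokens_to_ids works across transformers versions
    out = model.generate(**inputs, forced_bos_token_id=tokenizer.convert_tokens_to_ids("npi_Deva"), max_new_tokens=128)
    return tokenizer.decode(out[0], skip_special_tokens=True)

print(translate_en_to_ne("Hello, how are you?"))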
Translate NE→EN
def translate_ne_to_en(text):
    # Tell the tokenizer the input is Nepali (FLORES-200 code npi_Deva)
    tokenizer.src_lang = "npi_Deva"
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    # Force English (eng_Latn) as the first generated token
    out = model.generate(**inputs, forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"), max_new_tokens=128)
    return tokenizer.decode(out[0], skip_special_tokens=True)

print(translate_ne_to_en("नमस्ते, तपाईं कस्तो हुनुहुन्छ?"))
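
The two translation functions differ only in their source and target language codes, so they could be folded into one helper. The following is a minimal sketch, not part of the repository: the names `FLORES_CODES` and `translate` are hypothetical, and it assumes the fine-tuned checkpoint keeps the base NLLB tokenizer's FLORES-200 codes (`eng_Latn` for English, `npi_Deva` for Nepali). The tokenizer and model are passed in, so the helper works with whatever objects were loaded earlier.

```python
# Hypothetical helper: short language names mapped to FLORES-200 codes (assumed).
FLORES_CODES = {"en": "eng_Latn", "ne": "npi_Deva"}


def translate(text, src, tgt, tokenizer, model, max_length=128):
    """Translate `text` from `src` to `tgt`, where each is "en" or "ne"."""
    if src not in FLORES_CODES or tgt not in FLORES_CODES:
        raise ValueError(f"unsupported language pair: {src}->{tgt}")
    if src == tgt:
        return text  # nothing to translate

    # Set the source language so the tokenizer adds the matching language tag.
    tokenizer.src_lang = FLORES_CODES[src]
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_length)

    # Force the target language code as the first generated token.
    out = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(FLORES_CODES[tgt]),
        max_new_tokens=max_length,
    )
    return tokenizer.decode(out[0], skip_special_tokens=True)
```

With the model loaded as in the Load Model step, a call would look like `translate("Hello, how are you?", "en", "ne", tokenizer, model)`.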
Requirements
- transformers
- torch
- pandas
- datasets
Fine-tuning
To fine-tune on custom data:
- Prepare a CSV with English_Sentence and Nepali_Translation columns
- Use Fine_Tuning.ipynb or Finetune.md as reference
- Adjust hyperparameters based on GPU memory
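
As an illustration of the CSV preparation step, the sketch below reads sentence pairs and checks that both expected columns are present. It is an assumption about how the data might be validated, not the code from the notebooks: the function name `load_pairs` is hypothetical, and it uses Python's standard `csv` module rather than pandas so that it runs without extra dependencies.

```python
import csv
import io

# Column names taken from the fine-tuning instructions above.
REQUIRED_COLUMNS = ("English_Sentence", "Nepali_Translation")


def load_pairs(csv_file):
    """Read (english, nepali) pairs from an open CSV file, skipping incomplete rows."""
    reader = csv.DictReader(csv_file)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"CSV is missing columns: {missing}")

    pairs = []
    for row in reader:
        en = (row["English_Sentence"] or "").strip()
        ne = (row["Nepali_Translation"] or "").strip()
        if en and ne:  # drop rows where either side is empty
            pairs.append((en, ne))
    return pairs


# Tiny in-memory sample standing in for a real file.
sample = io.StringIO(
    "English_Sentence,Nepali_Translation\n"
    "Hello,नमस्ते\n"
    "Thank you,\n"
)
pairs = load_pairs(sample)
print(pairs)
```

For a file on disk, the same function could be called as `load_pairs(open(path, newline="", encoding="utf-8"))`, where `path` points to your own dataset.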