Tangkhul-English Translator

Model ID: Jetherruns/Tangkhul-English-Translator

This is a LoRA adapter fine-tuned on top of facebook/nllb-200-distilled-600M for translation between English and Tangkhul (Naga), with a strong focus on the religious domain (Bible, hymns, sermons, prayers, and Christian literature). The model is still in beta.


Model Description

  • Developed by: Jetherruns
  • Base Model: facebook/nllb-200-distilled-600M
  • Architecture: NLLB-200 (Distilled 600M) + LoRA adapter
  • Languages: English (eng_Latn) ↔ Tangkhul (nmf_Latn)
  • Primary Domain: Religious / Christian texts
  • License: CC BY-NC 4.0
    โ†’ Free for non-commercial use. Commercial use requires explicit permission.

This model was specifically fine-tuned to improve translation quality for religious content, where literal accuracy, spiritual tone, and proper terminology are critical.


Intended Uses

  • Translation of Bible verses, sermons, Christian songs, prayers, and religious literature
  • Assisting Tangkhul-speaking communities with English religious materials
  • Research in low-resource language translation, especially in the religious domain
  • Building translation tools, mobile apps, or web interfaces (non-commercial)

Note: This model is optimized for religious text. Performance on general-domain text or casual conversation may be noticeably lower.


How to Use

1. Using with PEFT (Recommended)

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel
import torch

model_id = "Jetherruns/Tangkhul-English-Translator"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(
    "facebook/nllb-200-distilled-600M",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
model = PeftModel.from_pretrained(model, model_id)
model.eval()

# Translate English to Tangkhul
text = "For God so loved the world that he gave his one and only Son."
tokenizer.src_lang = "eng_Latn"
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    # Look up the target-language token id via convert_tokens_to_ids;
    # the lang_code_to_id mapping is not available on recent versions
    # of the NLLB tokenizer.
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("nmf_Latn"),
    max_length=256,
    num_beams=4,
    early_stopping=True
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
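The same generation call works in both directions; Tangkhul → English just swaps the language codes. A small helper makes this explicit (a sketch: the `translate` function name is illustrative, not part of the model's API, and it assumes the `model` and `tokenizer` objects loaded above):

```python
def translate(text, tokenizer, model, src_lang, tgt_lang, max_length=256):
    """Translate `text` between English and Tangkhul with an NLLB model.

    src_lang / tgt_lang are NLLB-style codes: "eng_Latn" or "nmf_Latn".
    """
    # Tell the tokenizer which language the input is in.
    tokenizer.src_lang = src_lang
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        # Force decoding to start with the target-language token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_length=max_length,
        num_beams=4,
        early_stopping=True,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Tangkhul -> English: swap the codes
# print(translate(tangkhul_text, tokenizer, model, "nmf_Latn", "eng_Latn"))
```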