⚠️ DEPRECATED VERSION

Please use V8.5.3 instead.

This model has known issues and is kept for research purposes only. V8.5.3 fixes all bugs and is the current production model.

NLLB Bishnupriya Manipuri V8.4

LoRA fine-tune of facebook/nllb-200-distilled-600M for English → Bishnupriya Manipuri.

Status: Deprecated (formerly the production model). Designed to output pure Bishnupriya Manipuri (BPY), not Assamese or Bengali.

Training: 2,558 parallel pairs, 400 of them weighted for core vocabulary. Validation loss: ~0.85.
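
This card does not say how the 400 core-vocabulary pairs were weighted. One common approach is simple oversampling, where the core pairs are repeated in the training list. The sketch below illustrates that idea only: the function name `oversample`, the repeat factor of 3, the placeholder data, and the assumption that the 400 pairs are a subset of the 2558 are all hypothetical, not details taken from this model's training run.

```python
import random
from typing import List, Tuple

Pair = Tuple[str, str]  # (English, Bishnupriya Manipuri)

def oversample(pairs: List[Pair], core: List[Pair],
               factor: int = 3, seed: int = 0) -> List[Pair]:
    """Return a shuffled list in which each core pair appears `factor` times.

    Hypothetical helper: the real weighting scheme is not documented.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    core_set = set(core)
    # Non-core pairs appear once; core pairs are repeated `factor` times.
    rest = [p for p in pairs if p not in core_set]
    corpus = rest + core * factor
    random.Random(seed).shuffle(corpus)
    return corpus

# Placeholder strings, sized like the counts on this card (2558 total, 400 core).
all_pairs = [(f"en-{i}", f"bpy-{i}") for i in range(2558)]
core_pairs = all_pairs[:400]
corpus = oversample(all_pairs, core_pairs, factor=3)
print(len(corpus))  # 2158 + 400 * 3 = 3358
```

A per-example loss weight would be an alternative to repetition; which of the two (if either) was used here is not stated.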

Quick start

from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
import torch

# Load the base NLLB model, then attach the LoRA adapter on top of it.
base = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")
model = PeftModel.from_pretrained(base, "Emarthar/nllb-bpy-beng-v8_4")
tokenizer = AutoTokenizer.from_pretrained("Emarthar/nllb-bpy-beng-v8_4")
model.eval()  # inference mode: disables dropout

def translate(text):
    """Translate one English sentence to Bishnupriya Manipuri."""
    tokenizer.src_lang = "eng_Latn"
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(
            **inputs,
            # This adapter is driven through the "asm_Beng" target-language token.
            forced_bos_token_id=tokenizer.convert_tokens_to_ids("asm_Beng"),
            max_new_tokens=64,  # caps output length; long inputs may be cut off
            num_beams=5,
        )
    return tokenizer.batch_decode(out, skip_special_tokens=True)[0]

# Expected outputs are shown in the trailing comments.
print(translate("Water is important")) # পানীহান দরকারি
print(translate("The sky is blue")) # হাগহান নীলুৱাহান
print(translate("My name is Arunita")) # মর নাংহান অরুনিতা
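
Because the quick start caps generation at `max_new_tokens=64`, a multi-sentence input may be truncated. A minimal sketch of one workaround is to split the text into sentences and translate them one at a time. The helper names `split_sentences` and `translate_long` and the naive punctuation-based splitter are assumptions for illustration, not part of this model's API. A stand-in `fake_translate` is used so the sketch runs without downloading the model; in practice you would pass the `translate` function from the quick start.

```python
import re
from typing import Callable, List

def split_sentences(text: str) -> List[str]:
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def translate_long(text: str, translate_fn: Callable[[str], str]) -> str:
    """Translate sentence by sentence and join the results with spaces."""
    return " ".join(translate_fn(s) for s in split_sentences(text))

# Stand-in for translate(): wraps each sentence in angle brackets.
def fake_translate(sentence: str) -> str:
    return f"<{sentence}>"

print(translate_long("Water is important. The sky is blue!", fake_translate))
# <Water is important.> <The sky is blue!>
```

The splitter will break on abbreviations such as "Dr. Smith", so a proper sentence segmenter is worth substituting for real text.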
