Simitech AML AfriNLLB Translator

A translation model fine-tuned from facebook/nllb-200-distilled-600M on East African anti-money-laundering (AML) transaction narratives. It translates Luganda (lug_Latn) and Swahili (swh_Latn) mobile money transaction descriptions into English for downstream AML classification.

Why a specialized translator?

General NLLB models miss domain-specific AML vocabulary:

  • Mobile money agent terminology (float, airtime, USSD codes)
  • Ugandan colloquialisms used in social engineering scams
  • Financial crime typology phrases specific to the East African Community (EAC) corridor
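
One rough way to check whether a translator keeps domain vocabulary such as the terms listed above is to count how many expected English terms survive in its output. The sketch below is an illustration only, not the evaluation used for this model: the function name term_recall, the example sentences, and the expected-term lists are all hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def term_recall(pairs: Iterable[Tuple[str, Sequence[str]]]) -> float:
    """Return the fraction of expected domain terms found in the translations.

    `pairs` holds (english_translation, expected_terms) tuples. Matching is a
    case-insensitive substring test, so "float" would also match "floating";
    treat the score as a coarse signal rather than a precise metric.
    """
    found = 0
    total = 0
    for translation, expected_terms in pairs:
        lowered = translation.lower()
        for term in expected_terms:
            total += 1
            if term.lower() in lowered:
                found += 1
    # An empty input has nothing to recall; report 0.0 instead of dividing by zero.
    return found / total if total else 0.0


# Hypothetical translator outputs paired with the terms a reviewer expects.
samples = [
    ("The agent ran out of float", ["float"]),
    ("Buy credit using the code", ["airtime", "USSD"]),
]
print(term_recall(samples))
```

Running the same check on outputs from the base model and from this fine-tuned model, over the same reviewed sentences, would give a simple side-by-side comparison of vocabulary retention.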

Usage

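from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "darthvader256/simitech-aml-afrinllb-translator"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Source language: Luganda ("lug_Latn"); set "swh_Latn" for Swahili input.
tokenizer.src_lang = "lug_Latn"
inputs = tokenizer("nkusaba ssente z'omusawo omukisa", return_tensors="pt")

# Force the English language token as the first generated token so the
# model decodes into English. convert_tokens_to_ids also works on
# transformers versions that no longer provide tokenizer.lang_code_to_id.
output = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),
    max_new_tokens=128,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
# → "I am asking for doctor money, please"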
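
For more than one narrative at a time, the same calls can be wrapped in a small batch helper. This is a minimal sketch under stated assumptions rather than the project's AfriNLLBTranslator class: the names load_translator and translate_batch are hypothetical, and the helper accepts only the two source languages named in this card.

```python
from typing import List, Sequence

# NLLB language codes named in this model card.
SUPPORTED_SRC_LANGS = ("lug_Latn", "swh_Latn")
TARGET_LANG = "eng_Latn"


def load_translator(model_id: str = "darthvader256/simitech-aml-afrinllb-translator"):
    """Load the tokenizer and model (hypothetical convenience wrapper)."""
    # Imported lazily so translate_batch can be tested without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def translate_batch(
    texts: Sequence[str],
    src_lang: str,
    tokenizer,
    model,
    max_new_tokens: int = 128,
) -> List[str]:
    """Translate narratives that share one source language into English."""
    if src_lang not in SUPPORTED_SRC_LANGS:
        raise ValueError(f"src_lang must be one of {SUPPORTED_SRC_LANGS}, got {src_lang!r}")
    if not texts:
        return []

    # The source language is set once per batch, so each batch must be monolingual.
    tokenizer.src_lang = src_lang
    # Pad to the longest narrative so the batch forms a single tensor.
    inputs = tokenizer(list(texts), return_tensors="pt", padding=True, truncation=True)
    output = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TARGET_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output, skip_special_tokens=True)
```

A call might look like tokenizer, model = load_translator() followed by translate_batch(["Nimetuma pesa kwa wakala"], "swh_Latn", tokenizer, model), where the Swahili sentence is a made-up example input. Mixed-language inputs would need to be grouped by language before calling the helper.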

Source

decision-plane/app/training/nlp_finetune.py (AfriNLLBTranslator class)
