---
language:
  - en
  - lg
  - ach
license: mit
tags:
  - xlm-roberta
  - ugandan-languages
  - multilingual
  - masked-language-model
datasets:
  - Sunbird/ug40
  - Sunbird/external-translation-data
library_name: transformers
pipeline_tag: fill-mask
---

# XLM-RoBERTa Fine-tuned on Ugandan Languages

This model is XLM-RoBERTa-base fine-tuned for masked language modeling on Ugandan language text drawn from the Sunbird/ug40 and Sunbird/external-translation-data datasets.

## Usage

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-uganda-languages")
model = AutoModelForMaskedLM.from_pretrained("xlm-roberta-uganda-languages")

fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)

# XLM-RoBERTa uses <mask> as its mask token, not BERT's [MASK]
result = fill_mask("Abantu b'omubyalo tibatera kwikiriza <mask> muyaaka.")
print(result)
```

## Training Details

- **Training Steps:** N/A
- **Training Loss:** 2.1567
- **Learning Rate:** 5e-05
- **Batch Size:** 8
- **Epochs:** 3
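
The hyperparameters above map directly onto a standard `transformers` `Trainer` setup. The following is a minimal sketch of such a run, not the exact training script used for this model; the `tokenized_dataset` variable is an assumption standing in for a pre-tokenized `datasets.Dataset` built from the corpora listed in the metadata.

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base")

# Hyperparameters from the table above
training_args = TrainingArguments(
    output_dir="xlm-roberta-uganda-languages",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    num_train_epochs=3,
)

# Standard MLM collator: randomly masks a fraction of input tokens
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset,  # assumed: pre-tokenized Ugandan-language corpus
    data_collator=data_collator,
)
trainer.train()
```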