---
language:
- en
- lg
- ach
license: mit
tags:
- xlm-roberta
- ugandan-languages
- multilingual
- masked-language-model
datasets:
- Sunbird/ug40
- Sunbird/external-translation-data
library_name: transformers
pipeline_tag: fill-mask
---
# XLM-RoBERTa Fine-tuned on Ugandan Languages

This model is [XLM-RoBERTa-base](https://huggingface.co/xlm-roberta-base) fine-tuned with a masked language modeling objective on Ugandan-language text from the Sunbird/ug40 and Sunbird/external-translation-data datasets, covering English, Luganda, and Acholi.
## Usage

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-uganda-languages")
model = AutoModelForMaskedLM.from_pretrained("xlm-roberta-uganda-languages")

fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)

# XLM-RoBERTa's mask token is <mask>, not [MASK]
result = fill_mask("Abantu b'omubyalo tibatera kwikiriza <mask> muyaaka.")
print(result)
```
## Training Details
- Training Steps: N/A
- Training Loss: 2.1567
- Learning Rate: 5e-05
- Batch Size: 8
- Epochs: 3
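The hyperparameters above can be reproduced with a standard Hugging Face `Trainer` setup for masked language modeling. This is a minimal sketch, not the exact training script: the `output_dir` name and the omitted dataset preparation are assumptions for illustration.

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base")

# Applies dynamic random masking for the MLM objective
# (15% of tokens by default).
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True)

# Hyperparameters from the table above; output_dir is an assumption.
args = TrainingArguments(
    output_dir="xlm-roberta-uganda-languages",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    num_train_epochs=3,
)

# train_dataset would be the tokenized Ugandan-language corpus;
# its preparation is omitted here.
trainer = Trainer(model=model, args=args, data_collator=collator)
# trainer.train()
```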