---
library_name: transformers
datasets:
- oscar
- mc4
language:
- am
metrics:
- perplexity
pipeline_tag: fill-mask
widget:
- text: ከሀገራቸው ከኢትዮጵያ ከወጡ ግማሽ ምዕተ [MASK] ተቆጥሯል።
  example_title: Example
---
# bert-mini-amharic-16k
This model has the same architecture as bert-mini and was pretrained from scratch using the Amharic subsets of the oscar and mc4 datasets, on a total of 165 million tokens.
It achieves the following results on the evaluation set:

- Loss: 2.59
- Perplexity: 13.33
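The reported perplexity follows directly from the loss: for a language model, perplexity is the exponential of the cross-entropy loss. A quick sanity check:

```python
import math

# Perplexity of a language model is exp(cross-entropy loss).
loss = 2.59
print(math.exp(loss))  # ~13.33, matching the reported perplexity
```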
Even though this model has only 7.5 million parameters, its perplexity is comparable to that of the roughly 37x larger xlm-roberta-base model (279 million parameters) on the same Amharic evaluation set.
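As a fill-mask model, it can be used with the `transformers` pipeline API. A minimal sketch, assuming the model is published on the Hugging Face Hub (the repository ID below is a placeholder; substitute this model's actual Hub path):

```python
from transformers import pipeline

# Placeholder model ID; replace with this model's actual Hub repository path.
unmasker = pipeline("fill-mask", model="your-username/bert-mini-amharic-16k")

# The widget example above, roughly: "Half a [MASK](-century) has passed
# since they left their country, Ethiopia."
predictions = unmasker("ከሀገራቸው ከኢትዮጵያ ከወጡ ግማሽ ምዕተ [MASK] ተቆጥሯል።")
for p in predictions:
    print(f"{p['token_str']}\t{p['score']:.3f}")
```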