example_title: Example 2
---
# bert-tiny-amharic
This model has the same architecture as [bert-tiny](https://huggingface.co/prajjwal1/bert-tiny) and was pretrained from scratch using the Amharic subsets of the [oscar](https://huggingface.co/datasets/oscar), [mc4](https://huggingface.co/datasets/mc4), and [amharic-sentences-corpus](https://huggingface.co/datasets/rasyosef/amharic-sentences-corpus) datasets, on a total of `290 Million` tokens. The tokenizer was trained from scratch on the same text corpus and has a vocabulary size of 28k.
It achieves the following results on the evaluation set:
- `Perplexity: 71.52`
This model has just `4.18M` parameters.
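As a reminder of how the reported perplexity relates to training metrics: perplexity is the exponential of the per-token cross-entropy loss, so `71.52` corresponds to an evaluation loss of roughly 4.27 nats per token. A quick sanity check:

```python
import math

# Perplexity = exp(cross-entropy loss), so the reported eval perplexity
# of 71.52 implies a per-token evaluation loss of about 4.27 nats.
eval_loss = math.log(71.52)
print(round(eval_loss, 2))  # → 4.27
```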
# How to use
You can use this model directly with a pipeline for masked language modeling:
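A minimal sketch with the `transformers` fill-mask pipeline (the model id `rasyosef/bert-tiny-amharic` and the example sentence are assumptions; substitute this repository's actual id and your own Amharic text):

```python
from transformers import pipeline

# Assumed model id for illustration; replace with this repository's id.
fill_mask = pipeline("fill-mask", model="rasyosef/bert-tiny-amharic")

# An Amharic sentence with one [MASK] token to be filled in.
results = fill_mask("አዲስ አበባ የኢትዮጵያ [MASK] ከተማ ናት።")
for r in results:
    print(r["token_str"], round(r["score"], 4))
```

Each result is a dict containing the predicted token (`token_str`), its probability (`score`), and the completed sentence (`sequence`).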