rasyosef committed
Commit 87c5d6d (verified) · 1 parent: a4db597

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -18,7 +18,8 @@ widget:
 
 # bert-tiny-amharic
 
-This model has the same architecture as [bert-tiny](https://huggingface.co/prajjwal1/bert-tiny) and was pretrained from scratch using the Amharic subsets of the [oscar](https://huggingface.co/datasets/oscar), [mc4](https://huggingface.co/datasets/mc4), and [amharic-sentences-corpus](https://huggingface.co/datasets/rasyosef/amharic-sentences-corpus) datasets, on a total of **290 Million tokens**. The tokenizer was trained from scratch on the same text corpus, and had a vocabulary size of 28k.
+This model has the same architecture as [bert-tiny](https://huggingface.co/prajjwal1/bert-tiny) and was pretrained from scratch using the Amharic subsets of the [oscar](https://huggingface.co/datasets/oscar), [mc4](https://huggingface.co/datasets/mc4), and [amharic-sentences-corpus](https://huggingface.co/datasets/rasyosef/amharic-sentences-corpus) datasets, on a total of **290 million tokens**. The tokenizer was trained from scratch on the same text corpus, and had a vocabulary size of 28k.
+
 It achieves the following results on the evaluation set:
 - `Loss: 4.27`
 - `Perplexity: 71.52`
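
The two evaluation figures in the README are consistent with the usual convention that perplexity is the exponential of the mean cross-entropy loss. The sketch below checks this, assuming the reported loss is a natural-log cross-entropy averaged over tokens; the README does not state this explicitly, and the variable names are illustrative only.

```python
import math

# Evaluation loss as reported in the README (rounded to two decimals).
# Assumption: this is the mean cross-entropy in nats.
eval_loss = 4.27

# Perplexity is conventionally exp(mean cross-entropy loss).
perplexity = math.exp(eval_loss)

print(f"Perplexity: {perplexity:.2f}")  # prints "Perplexity: 71.52"
```

Because the loss is rounded to two decimals, recomputing perplexity this way only reproduces the reported value to about the same precision.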