# bert-tiny-amharic
This model has the same architecture as [bert-tiny](https://huggingface.co/prajjwal1/bert-tiny) and was pretrained from scratch using the Amharic subsets of the [oscar](https://huggingface.co/datasets/oscar), [mc4](https://huggingface.co/datasets/mc4), and [amharic-sentences-corpus](https://huggingface.co/datasets/rasyosef/amharic-sentences-corpus) datasets, on a total of **290 million tokens**. The tokenizer was trained from scratch on the same text corpus and has a vocabulary size of 28k.
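Since this is a BERT-style masked language model, it can be loaded through the standard `fill-mask` pipeline. The sketch below is a hypothetical usage example: the model id `rasyosef/bert-tiny-amharic` is an assumption inferred from the dataset namespace above and may differ from the actual repository name, and the example sentence is illustrative.

```python
# Hypothetical usage sketch for a BERT-style masked language model.
# The model id "rasyosef/bert-tiny-amharic" is assumed from the dataset
# namespace above and may not match the actual repository name.

def load_unmasker(model_id: str = "rasyosef/bert-tiny-amharic"):
    """Build a fill-mask pipeline for the given model id (downloads weights)."""
    from transformers import pipeline  # deferred import: only needed at call time
    return pipeline("fill-mask", model=model_id)

# Example (requires network access to the Hugging Face Hub):
# unmask = load_unmasker()
# unmask("አዲስ አበባ የ[MASK] ዋና ከተማ ናት።")  # predict the masked Amharic token
```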
It achieves the following results on the evaluation set:
- `Loss: 4.27`
- `Perplexity: 71.52`
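The two numbers above are consistent: perplexity is the exponential of the cross-entropy loss, assuming the loss is measured in nats (natural log), as is standard in PyTorch.

```python
import math

# Perplexity of a masked language model is exp(cross-entropy loss),
# assuming the loss is in nats (natural log), as in PyTorch.
eval_loss = 4.27
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # 71.52, matching the reported value
```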