An MT5Tokenizer-based Amharic and English tokenizer trained using [Fineweb](http
This tokenizer aims to better represent Amharic while doing the same for English.

To balance the dataset, I have used only 3 million document samples from it. The vocabulary size of this tokenizer is the same as that of `google/mt5-small`.

### MT5 Tokenizer vs AmhT5 Tokenizer

```python
from transformers import MT5TokenizerFast