An MT5Tokenizer-based Amharic and English tokenizer trained using [Fineweb](http
This tokenizer aims to better represent Amharic while doing the same for English.

To balance the dataset, I have used only 3 million document samples from it. The vocabulary size of this tokenizer is the same as that of `google/mt5-small`.

### MT5 Tokenizer vs AmhT5 Tokenizer

```python
from transformers import MT5TokenizerFast