Commit
·
810352b
1
Parent(s):
2f9c5e7
Update README.md
README.md
CHANGED
|
@@ -1,3 +1,20 @@
|
|
| 1 |
---
|
| 2 |
license: mit
|
| 3 |
---
|
| 1 |
---
|
| 2 |
license: mit
|
| 3 |
---
|
| 4 |
+
## The T5 base model for the Czech Language
|
| 5 |
+
This is a T5 base model for the Czech language, built as a smaller version of the google/mt5-base model (https://huggingface.co/google/mt5-base).
|
| 6 |
+
To make this model, I retained only the Czech and some of the English embeddings from the original multilingual model.
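As a rough illustration of what "retaining only some embeddings" means, the sketch below counts token frequencies in two small corpora, keeps the most frequent token ids from each, and copies only those rows of the embedding matrix. Everything here is hypothetical toy data (a 6-token vocabulary, 2-dimensional vectors, the helper names `top_token_ids` and `shrink_embeddings`); it is not the code used to build this model, and a real run would also need the SentencePiece vocabulary rebuilt to match the new ids.

```python
from collections import Counter


def top_token_ids(tokenized_corpus, k):
    """Return the k most frequent token ids in a corpus of token-id lists."""
    counts = Counter(tok for sentence in tokenized_corpus for tok in sentence)
    return [tok for tok, _ in counts.most_common(k)]


def shrink_embeddings(embeddings, kept_ids):
    """Keep only the rows for kept_ids; also return an old-id -> new-id map."""
    new_matrix = [embeddings[i] for i in kept_ids]
    old_to_new = {old: new for new, old in enumerate(kept_ids)}
    return new_matrix, old_to_new


# Hypothetical toy data: 6 tokens, 2-dimensional embedding vectors.
embeddings = [[float(i), float(i) + 0.5] for i in range(6)]
czech_corpus = [[1, 3, 3, 5], [3, 1, 3]]
english_corpus = [[2, 2, 4], [2, 0]]

# Always keep special tokens (assumed here to be id 0), then add the
# top tokens of each language without duplicates.
kept = [0]
for tok in top_token_ids(czech_corpus, 2) + top_token_ids(english_corpus, 1):
    if tok not in kept:
        kept.append(tok)

small_embeddings, id_map = shrink_embeddings(embeddings, kept)
```

The same row selection would apply to both the input embedding matrix and the output projection, since each has one row per vocabulary entry.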
|
| 7 |
+
# Modifications to the original multilingual T5 base model:
|
| 8 |
+
1- The number of parameters was reduced from 582M to 244M.
|
| 9 |
+
|
| 10 |
+
2- The sentencepiece vocabulary was shrunk from 250K to 30K tokens by keeping the top 20K Czech and top 10K English tokens.
|
| 11 |
+
|
| 12 |
+
3- The model size was reduced from 2.2GB to 0.9GB.
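The parameter figures above can be sanity-checked with a little arithmetic. The sketch below assumes values that are not stated in this README: a hidden size of 768 and an original vocabulary of 250,112 entries (taken from the public mT5-base configuration), separate input and output embedding matrices, and a new vocabulary of exactly 30,000 tokens.

```python
# Assumed values (not given in this README): mT5-base hidden size and vocab size.
D_MODEL = 768
OLD_VOCAB = 250_112
NEW_VOCAB = 30_000          # "30K" in the list above, assumed to be exact
EMBEDDING_MATRICES = 2      # input embeddings + untied output projection


def embedding_params(vocab_size, d_model=D_MODEL, matrices=EMBEDDING_MATRICES):
    """Number of parameters stored in the vocabulary-sized matrices."""
    return vocab_size * d_model * matrices


# Parameters removed by shrinking the vocabulary.
removed = embedding_params(OLD_VOCAB) - embedding_params(NEW_VOCAB)

# Start from the 582M quoted above and subtract what was removed.
remaining_millions = 582 - removed / 1e6

print(f"removed: {removed / 1e6:.0f}M, remaining: about {remaining_millions:.0f}M")
# prints "removed: 338M, remaining: about 244M"
```

Under these assumptions nearly all of the reduction comes from the vocabulary-sized matrices, which is consistent with the rest of the network being left unchanged.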
|
| 13 |
+
|
| 14 |
+
Notes:
|
| 15 |
+
|
| 16 |
+
This is a base T5 model for the Czech language, so it must be fine-tuned on an appropriate dataset before it can be used for any downstream task.
|
| 17 |
+
|
| 18 |
+
References:
|
| 19 |
+
|
| 20 |
+
Most of the work to create this model follows the post by David Dale: "How to adapt a multilingual T5 model for a single language" (https://towardsdatascience.com/how-to-adapt-a-multilingual-t5-model-for-a-single-language-b9f94f3d9c90)
|