azizbarank/cst5-base


azizbarank committed on Apr 23, 2022

Commit 810352b · 1 Parent(s): 2f9c5e7

Update README.md

Files changed (1)

README.md +17 -0


@@ -1,3 +1,20 @@
 ---
 license: mit
 ---

+## The T5 base model for the Czech Language
+This is a T5 base model for the Czech language. It is a reduced version of the google/mt5-base model (https://huggingface.co/google/mt5-base).
+To create this model, I retained only the Czech embeddings and some of the English embeddings from the original multilingual model.
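
The idea of retaining only some embeddings can be illustrated with a minimal sketch. It is not the code used to build this model: the function name `shrink_embeddings`, the toy matrix, and the `kept_ids` list are all hypothetical, and the real procedure works on the model's weight tensors and also rebuilds the sentencepiece vocabulary.

```python
def shrink_embeddings(embeddings, kept_ids):
    """Keep only the embedding rows listed in kept_ids (hypothetical helper).

    Returns the smaller embedding table and a mapping from old token ids
    to new token ids, so that the tokenizer can be remapped consistently.
    """
    new_embeddings = [list(embeddings[i]) for i in kept_ids]
    old_to_new = {old_id: new_id for new_id, old_id in enumerate(kept_ids)}
    return new_embeddings, old_to_new


# Toy "multilingual" table: 6 tokens, 2-dimensional embeddings.
toy_embeddings = [[float(i), float(i) * 10] for i in range(6)]

# Hypothetical ids of the tokens we decide to keep
# (standing in for the most frequent Czech and English tokens).
kept_ids = [0, 1, 4, 5]

small_embeddings, old_to_new = shrink_embeddings(toy_embeddings, kept_ids)
print(len(small_embeddings), old_to_new)
```

Each kept token receives a new, contiguous id, and its vector is copied unchanged, so the retained tokens behave exactly as they did in the original model.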
+### Modifications to the original multilingual T5 base model:
+1- The number of parameters was reduced from 582M to 244M.
+2- The sentencepiece vocabulary was shrunk from 250K to 30K tokens by keeping the top 20K Czech and 10K English tokens.
+3- The model size was reduced from 2.2GB to 0.9GB.
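
The figures in the list above can be cross-checked with some rough arithmetic. The sketch below rests on assumptions that are not stated in this README: a hidden size of 768 and an original vocabulary of 250,112 tokens for mt5-base, separate (untied) input and output embedding matrices, a new vocabulary of exactly 30,000 tokens, and 4-byte float32 weights. The 582M starting point is the rounded figure from the list.

```python
D_MODEL = 768              # assumed hidden size of mt5-base
OLD_VOCAB = 250_112        # assumed original vocabulary size
NEW_VOCAB = 30_000         # 20K Czech + 10K English tokens (from the list)
TOTAL_OLD = 582_000_000    # rounded parameter count from the list
EMBEDDING_MATRICES = 2     # assumed: untied input and output embeddings
BYTES_PER_PARAM = 4        # assumed: float32 weights


def embedding_params(vocab_size):
    """Parameters held in the embedding matrices for a given vocabulary."""
    return EMBEDDING_MATRICES * vocab_size * D_MODEL


def size_gib(num_params):
    """Approximate checkpoint size in GiB."""
    return num_params * BYTES_PER_PARAM / 2**30


old_emb = embedding_params(OLD_VOCAB)
new_emb = embedding_params(NEW_VOCAB)

# Only the embedding matrices change; the rest of the network is untouched.
total_new = TOTAL_OLD - old_emb + new_emb

print(f"embedding parameters: {old_emb / 1e6:.1f}M -> {new_emb / 1e6:.1f}M")
print(f"total parameters:     {TOTAL_OLD / 1e6:.0f}M -> {total_new / 1e6:.0f}M")
print(f"approximate size:     {size_gib(TOTAL_OLD):.1f} -> {size_gib(total_new):.1f} GiB")
```

Under these assumptions the estimate is consistent with the reported reduction, which suggests that almost all of the savings come from the smaller embedding matrices.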
+Notes:
+Since this is a base T5 model for the Czech language, it must first be fine-tuned on an appropriate dataset before it can be used for any downstream task.
+References:
+This work is largely based on the post by David Dale: "How to adapt a multilingual T5 model for a single language" (https://towardsdatascience.com/how-to-adapt-a-multilingual-t5-model-for-a-single-language-b9f94f3d9c90)