Update README.md
To transfer knowledge from the English model to Czech, we developed a simple method.
Figure 4 (ablation): Test perplexity over the course of training for the vocabulary-swap method on TinyLLAMA. Our method (green curve) vs. TinyLLAMA trained from scratch (blue curve).
The vocabulary swap was done in the same way as for our [Czech-GPT-2](https://huggingface.co/BUT-FIT/Czech-GPT-2-XL-133k) model (see its model card for a comprehensive description).
For CSMPT7b, we managed to align 4,177 English tokens with corresponding Czech tokens.
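A vocabulary swap of this kind can be sketched as follows. This is a hypothetical illustration, not the authors' actual code: for every token whose string form appears in both the source (English) and target (Czech) tokenizer vocabularies, the trained source embedding is reused; all other target rows get a fresh random initialization. The function name and the toy vocabularies are invented for the example.

```python
import numpy as np

def swap_vocab_embeddings(src_vocab, src_emb, tgt_vocab, seed=0):
    """Initialize a target embedding matrix from a trained source one.

    src_vocab / tgt_vocab: dict mapping token string -> row id.
    src_emb: trained source embedding matrix of shape (V_src, d).
    Returns (tgt_emb, n_aligned).
    """
    rng = np.random.default_rng(seed)
    d = src_emb.shape[1]
    # Fresh random init for the whole target embedding matrix.
    tgt_emb = rng.normal(0.0, 0.02, size=(len(tgt_vocab), d))
    n_aligned = 0
    for tok, tgt_id in tgt_vocab.items():
        src_id = src_vocab.get(tok)
        if src_id is not None:
            # Token exists in both vocabularies: carry over the trained row.
            tgt_emb[tgt_id] = src_emb[src_id]
            n_aligned += 1
    return tgt_emb, n_aligned

# Toy example: "the" and "dog" overlap, so 2 rows are carried over.
src_vocab = {"the": 0, "dog": 1, "cat": 2}
src_emb = np.arange(9, dtype=float).reshape(3, 3)
tgt_vocab = {"pes": 0, "the": 1, "dog": 2}
tgt_emb, n_aligned = swap_vocab_embeddings(src_vocab, src_emb, tgt_vocab)
print(n_aligned)  # 2
```

In practice the alignment would be run over the full tokenizers (yielding counts like the 4,177 tokens above), and the resulting matrix would replace the model's input (and tied output) embedding before continued pretraining.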
## Hyperparameters
Hyperparameters not mentioned here were kept the same as for MPT.