Commit 35a3b38 · Parent: 82c8a2c
Update README.md

README.md CHANGED

@@ -41,7 +41,7 @@ APT2-1B-Base is a base model introducing a new series of the APT2 (Azurro Pretra
 APT2-1B-Base is an autoregressive language model based on the architecture of a transformer. It has been trained with data collected before April 2023.

-30 billion tokens have been used for training, and the training dataset (the Polish corpus) has over 7 billion tokens.
+30 billion tokens have been used for training, and the training dataset (the Polish corpus) has over 7 billion tokens.

 A special tokenizer has been prepared and trained for the purpose of training the model.