Commit 242b971 (verified) by kgrabko · 1 Parent(s): 8a3271f

Update README.md

Files changed (1): README.md +2 -0
@@ -6,6 +6,8 @@ My smaller GPT-2 models utilize LayerNorm and FFN layers, whereas for larger mod
  I have replaced these components with RMSNorm and SwiGLU. This adjustment allows a smoother transition to large model architectures,
  including models with 8B, 33B, 70B, and 120B parameters.
 
+ So please use the GPT-2 tokenizer from the Hugging Face library for English text, and the BERT tokenizer from the same library for multilingual text.
+
  The Transformer block is not frozen, which gives more freedom to tune the model from scratch.
 
  My GPT-2 architecture is similar to the classic GPT-2 Transformer.
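
The README context in this diff mentions swapping LayerNorm and the plain FFN for RMSNorm and SwiGLU in the larger models. A minimal, dependency-free sketch of the arithmetic behind those two components might look like the following. The function names, the `eps` default, and the list-based matrix layout are illustrative assumptions, not code from this repository.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Scale x by the reciprocal of its root mean square, then by a learned weight.

    Unlike LayerNorm, no mean is subtracted and no bias is added.
    """
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


def silu(z):
    """SiLU (swish) activation: z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: w_down @ (silu(w_gate @ x) * (w_up @ x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)
```

Compared with a classic GPT-2 FFN, the SwiGLU variant carries three weight matrices instead of two, because the gate and the up projection are separate.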
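
The line added in this commit recommends a GPT-2 tokenizer for English and a BERT tokenizer for multilingual text. One way to wire that choice up with the Hugging Face `transformers` library is sketched below. The checkpoint names `gpt2` and `bert-base-multilingual-cased` are assumptions, since the README names only the tokenizer families, and the helper functions are hypothetical.

```python
# Assumed Hub checkpoints; the README does not name specific ones.
ENGLISH_CHECKPOINT = "gpt2"
MULTILINGUAL_CHECKPOINT = "bert-base-multilingual-cased"


def tokenizer_checkpoint(language: str) -> str:
    """Return the checkpoint whose tokenizer should be used for `language`."""
    is_english = language.strip().lower() in {"en", "english"}
    return ENGLISH_CHECKPOINT if is_english else MULTILINGUAL_CHECKPOINT


def load_tokenizer(language: str):
    """Load the matching tokenizer from the Hub or the local cache."""
    # Imported lazily so the selection logic works without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(tokenizer_checkpoint(language))
```

For example, `load_tokenizer("en")("Hello world")["input_ids"]` would return GPT-2 token IDs. The two tokenizers have different vocabulary sizes, so the model's embedding table has to match whichever one is chosen.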