Thishyaketh committed on
Commit
984d01f
·
verified ·
1 Parent(s): 987cdf7

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -23,7 +23,7 @@ While using with transformers you can only use the 15M variant for now.
 
 NGen 2 is an advanced Transformer model training pipeline that supports multiple model variants. It ranges from a **nano** variant (approximately 120M parameters) to a **foundational** variant (approximately 1B parameters). The pipeline incorporates modern architectural improvements such as rotary positional embeddings, RMSNorm, and GEGLU activations to boost performance and training efficiency.
 
-> **Note:** Although NGen 3 is designed to train a 1B-parameter model, its advanced architecture pushes its performance closer to that of much larger models.
+> **Note:** Although NGen 2 is designed to train a 1B-parameter model, its advanced architecture pushes its performance closer to that of much larger models. Try using NGen3 for performance.
 
 
 