Update README.md
README.md
CHANGED
@@ -121,7 +121,7 @@ print(tokenizer.decode(outputs[0]))
 ## Model
 
 - **Architecture:** GPT-2 model with Multi-Query Attention and Fill-in-the-Middle objective.
-- **
+- **Training steps:** 120K
 - **Context length:** 8K tokens
 - **Pretraining tokens:** 22 billion
 - **Precision:** bfloat16