Update README.md
--- a/README.md
+++ b/README.md
@@ -117,7 +117,7 @@ The model, NT-Java-1.1B, has been trained on publicly available datasets and com
 
 ## Model
 
-- **Architecture:** GPT-2 model with
+- **Architecture:** GPT-2 model with Multi-Query Attention and Fill-in-the-Middle objective
 - **•Fine-training steps:** 50k
 - **Pretraining tokens:** 22 Billion
 - **Precision:** bfloat16