## Training procedure

This model was trained for 402 billion tokens over 383,500 steps on a TPU v3-256 pod. It was trained as an autoregressive language model, using cross-entropy loss to maximize the likelihood of predicting the next token correctly.
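The training code itself is not included in this card, but the objective described above — cross-entropy loss on next-token prediction — can be illustrated with a toy PyTorch sketch (the shapes and vocabulary size here are arbitrary stand-ins, not the model's real dimensions):

```python
import torch
import torch.nn.functional as F

# Toy stand-ins: logits of shape (batch, seq_len, vocab_size) and
# integer token ids of shape (batch, seq_len). Real values would come
# from the model and tokenizer.
vocab_size = 100
logits = torch.randn(2, 8, vocab_size)
tokens = torch.randint(0, vocab_size, (2, 8))

# Autoregressive shift: the prediction at position t is scored against
# the token at position t+1.
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),  # predictions for positions 0..n-2
    tokens[:, 1:].reshape(-1),               # targets are the next tokens
)
```

Minimizing this loss is equivalent to maximizing the likelihood of the correct next token at every position.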
## How to use

This model can be easily loaded using the `AutoModelForCausalLM` functionality:
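A minimal loading sketch follows. The checkpoint identifier is a placeholder, since this excerpt of the card does not name it; substitute the model's actual Hugging Face Hub id.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id — replace with the model's real Hub identifier.
MODEL_ID = "your-org/your-model"

def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and causal LM with the standard Auto classes."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model

# Usage (downloads weights on first call):
# tokenizer, model = load_model()
# inputs = tokenizer("Hello, my name is", return_tensors="pt")
# output_ids = model.generate(**inputs, max_new_tokens=20)
# print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```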