Update README.md
README.md CHANGED
@@ -12,7 +12,7 @@ tags:
 
 
 
-The original model, LLaMA 1 was pre-trained at a sequence length of 2048 tokens. We went through two individual runs, targeting a sequence length of 16,
+The original model, LLaMA 1 was pre-trained at a sequence length of 2048 tokens. We went through two individual runs, targeting a sequence length of 16,384 which is a
 significant increase over the original length. While it was originally pre-trained on 1.4T tokens, it was shown to respond positively to our 500M token train and will
 coherently write and keep the same writing format (granted some caveats) up to 12K tokens relatively consistently.
 
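For downstream use, here is a minimal sketch of loading the checkpoint and prompting within the extended window using Hugging Face `transformers`. The repository id `example-org/llama-16k`, the dtype, and the generation settings are placeholders, and the sketch assumes the extended length is recorded in `config.max_position_embeddings`. The README excerpt does not describe how the context extension was trained, so nothing below should be read as the training recipe.

```python
# Minimal sketch, assuming a hypothetical repo id "example-org/llama-16k" and that
# the extended context length is exposed via config.max_position_embeddings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "example-org/llama-16k"  # placeholder, not the actual repository name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps long-context inference in memory
    device_map="auto",
)

# The card advertises a 16,384-token window but coherent writing up to ~12K tokens,
# so leave headroom below the hard limit when packing a long prompt.
max_ctx = model.config.max_position_embeddings  # expected to read 16384
prompt = "A very long document to continue..."
inputs = tokenizer(
    prompt,
    return_tensors="pt",
    truncation=True,
    max_length=max_ctx - 512,  # reserve room for the generated continuation
).to(model.device)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.8)

# Print only the newly generated continuation, not the prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```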