This is a checkpoint of the 1.3B GLA model used in the paper [Gated Linear Attention](https://arxiv.org/abs/2312.06635). The model was trained on 100B tokens from the SlimPajama dataset, tokenized with the Llama2 tokenizer.
See the model definition and loading script in this [repo](https://github.com/berlino/gated_linear_attention).