Update README.md
README.md
CHANGED
@@ -45,4 +45,4 @@ FineWeb, 187M params: 3.73 val loss / 41.6 PPL (75k steps)
 Architecture: 21 layers, 768d, 12 heads, 16 slots
 Links
 Code: https://github.com/DigitalDaimyo/AddressedStateAttention
-Paper: https://github.com/DigitalDaimyo/AddressedStateAttention/
+Paper: https://github.com/DigitalDaimyo/AddressedStateAttention/paper_drafts