add model weights for 3 implementations of einygpt
- README.md +10 -0
- model_weights_gqa_tt.pth +3 -0
- model_weights_mha.pth +3 -0
- model_weights_mqa.pth +3 -0
README.md CHANGED
@@ -1,3 +1,13 @@
 ---
 license: mit
 ---
+
+# einygpt
+
+Here are the models I've trained with the code in [einygpt](https://github.com/clankur/einygpt). For reference they are:
+
+- [a multihead attention model](./model_weights_mha.pth) replicating the model discussed in the [TinyStories paper](https://arxiv.org/abs/2305.07759), using the GPT2Tokenizer
+- [a multiquery attention model](model_weights_mqa.pth) using the GPT2Tokenizer
+- [a grouped query attention model with 4 groups](model_weights_gqa_tt.pth) using its own [tokenizer](https://github.com/clankur/einygpt/blob/main/tiny_tokenizer.py)
+
+To play with these models, you can see how they are used in [this notebook](https://github.com/clankur/einygpt/blob/main/perplexity.ipynb).
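The notebook linked above is the reference for how these checkpoints are used. As a minimal sketch, assuming each .pth file is a plain torch-serialized state_dict (the model class and its hyperparameters live in the einygpt repo, not in this commit), one way to inspect a checkpoint:

```python
# Sketch: peek inside a checkpoint with PyTorch.
# Assumption: the .pth files hold state_dicts, not pickled model objects.
import torch

state_dict = torch.load("model_weights_mha.pth", map_location="cpu")

# Print parameter names and shapes; the key/value-projection shapes are
# where the multihead, multiquery, and grouped query variants differ.
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))
```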
model_weights_gqa_tt.pth ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3abfbdd339e49a369a1c7a0176a754c281d87ca46d19f8249c6116a3b31e3312
+size 17763087
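Note that what's committed here is the Git LFS pointer (version, oid, size), not the binary itself; the oid is the SHA-256 of the actual file's contents. A small sketch for verifying a downloaded checkpoint against the pointer's oid:

```python
# Sketch: check a downloaded weight file against the sha256 recorded
# in its Git LFS pointer (the "oid sha256:..." line above).
import hashlib

def file_sha256(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

expected = "3abfbdd339e49a369a1c7a0176a754c281d87ca46d19f8249c6116a3b31e3312"
assert file_sha256("model_weights_gqa_tt.pth") == expected
```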
model_weights_mha.pth ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:adc57bb222d0af37f2fe187c0ef16c64de8f83383fe70e62a9269491745c9cfe
+size 28085519
model_weights_mqa.pth ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:141c0e3705e6ad5c15131acde6965ecedf50ef64ff2881efeaee88be43653fa5
+size 28429583
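Since these files are stored via LFS, cloning without git-lfs yields only the pointer files above. A sketch of pulling one checkpoint directly, assuming the repository is hosted on the Hugging Face Hub (the repo id below is a placeholder; it is not stated in this commit):

```python
# Sketch: fetch one weight file from the Hub; hf_hub_download resolves
# the LFS pointer to the real binary and caches it locally.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="<user>/<repo>",  # placeholder: not given in this commit
    filename="model_weights_mha.pth",
)
print(path)  # local cache path to the downloaded checkpoint
```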