Update README.md
GLORT2 (GLORT2 Low Rank Transformer Transformer) is a transformer model where ev…
Also, sorry: I just realized there's some residue from where I copied the model code out of my own projects, including some "expanded lm head size" stuff. Just ignore that if you're looking at the config and code; this isn't a serious project, so I don't care too much that it's there.

| model | 512-token strided perplexity on a Pile test set | tokens |
| --- | --- | --- |
| cerebras 111m | 21.551 | 2.2B |
| cerebras 256m | 15.203 | 5.1B |
| pythia 70m | 22.393 | 300B |
| pythia 160m | 13.934 | 300B |
| pythia 410m | 9.618 | 300B |
| GLORT2 (205m) | 13.052 | 2.2B |
| custom llama w/ same settings as cerebras 111m | 13.882 | 2.2B |
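For reference, "512-token strided perplexity" usually means scoring a long token stream in sliding windows rather than all at once. Here is a minimal sketch of that bookkeeping only; `nll_fn` is a hypothetical stand-in for a real model call that returns the summed negative log-likelihood of the newly scored tokens in a window, and the window/stride defaults are assumptions, not this repo's exact evaluation code.

```python
import math

def strided_perplexity(nll_fn, n_tokens, window=512, stride=512):
    """Perplexity over a stream of n_tokens tokens, scored in windows
    of `window` tokens advanced by `stride`.

    nll_fn(start, end, n_scored) must return the summed negative
    log-likelihood of the last `n_scored` tokens in [start, end) --
    the tokens not yet scored by an earlier window. With stride ==
    window the windows tile the stream; a smaller stride gives each
    scored token more left context at the cost of more compute.
    """
    total_nll = 0.0
    total_scored = 0
    prev_end = 0
    for start in range(0, n_tokens, stride):
        end = min(start + window, n_tokens)
        n_scored = end - prev_end          # only score new tokens once
        total_nll += nll_fn(start, end, n_scored)
        total_scored += n_scored
        prev_end = end
        if end == n_tokens:
            break
    return math.exp(total_nll / total_scored)

# toy "model": uniform over a 100-token vocabulary, so every token
# costs log(100) nats and the perplexity should come out to ~100
uniform = lambda start, end, n_scored: n_scored * math.log(100)
print(strided_perplexity(uniform, n_tokens=2000))
```

Overlapping windows (stride < window) generally report a slightly lower, more favorable perplexity than disjoint ones, so the stride matters when comparing numbers across papers.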

| Tasks          | Version | Filter | n-shot | Metric   |  Value |   | Stderr |
|----------------|--------:|--------|-------:|----------|-------:|---|-------:|
| arc_challenge  |       1 | none   |     25 | acc      | 0.1706 | ± | 0.0110 |
|                |         | none   |     25 | acc_norm | 0.2099 | ± | 0.0119 |
| truthfulqa_mc2 |       2 | none   |      0 | acc      | 0.4599 | ± | 0.0154 |
| winogrande     |       1 | none   |      5 | acc      | 0.5083 | ± | 0.0141 |