Update README.md
README.md
@@ -22,6 +22,10 @@ This is a merge of pre-trained language models created using [mergekit](https://
 
 Initial private run of openllm benchmark:
 
+Cut it off before the GSM8K, but it definitely looks like the eqbench is an outlier on general capabilities.
+
+(Not that it isn't still decently smart.)
+
 | Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
 |-------------|------:|------|-----:|--------|---|-----:|---|-----:|
 |arc_challenge| 1|none | 25|acc |↑ |0.6527|± |0.0139|