Update README.md
README.md
CHANGED
@@ -48,24 +48,23 @@ This training run is monolingual and uses c4en and english wikipedia datasets.
 
 ## Test results
 
-These are the results from [EleutherAI/lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) at
+These are the results from [EleutherAI/lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) at 81B (tokens trained) checkpoint.
 
 |     Task     |Version| Metric | Value |   |Stderr|
 |--------------|------:|--------|------:|---|-----:|
-|anli_r1       |      0|acc     | 0.
-|anli_r2       |      0|acc     | 0.
-|anli_r3       |      0|acc     | 0.
-|hellaswag     |      0|acc     | 0.
-|              |       |acc_norm| 0.
-|lambada_openai|      0|ppl     |
-|              |       |acc     | 0.
-|mathqa        |      0|acc     | 0.
-|              |       |acc_norm| 0.
-|piqa          |      0|acc     | 0.
-|              |       |acc_norm| 0.
-|winogrande    |      0|acc     | 0.
-|wsc           |      0|acc     | 0.
-
+|anli_r1       |      0|acc     | 0.3260|±  |0.0148|
+|anli_r2       |      0|acc     | 0.3380|±  |0.0150|
+|anli_r3       |      0|acc     | 0.3583|±  |0.0138|
+|hellaswag     |      0|acc     | 0.4666|±  |0.0050|
+|              |       |acc_norm| 0.6157|±  |0.0049|
+|lambada_openai|      0|ppl     |10.0153|±  |0.3145|
+|              |       |acc     | 0.5403|±  |0.0069|
+|mathqa        |      0|acc     | 0.2332|±  |0.0077|
+|              |       |acc_norm| 0.2348|±  |0.0078|
+|piqa          |      0|acc     | 0.7503|±  |0.0101|
+|              |       |acc_norm| 0.7503|±  |0.0101|
+|winogrande    |      0|acc     | 0.5872|±  |0.0138|
+|wsc           |      0|acc     | 0.5673|±  |0.0488|
 
 ## Installation
 