Update README.md
README.md CHANGED
@@ -62,6 +62,7 @@ Average: 75.9% without mmlu
 | | |mc2 |77.90|± | 1.37|
 
 ### BigBench Reasoning Test
+```
 | Task | Version | Metric | Value | | Stderr|
 |------------------------------------------------|---------|-----------------------|--------------|---|-------|
 | bigbench_causal_judgement | 0| multiple_choice_grade | 0.6000 | _ | 0.0356 |
@@ -84,8 +85,10 @@ Average: 75.9% without mmlu
 | bigbench_tracking_shuffled_objects_five_objects| 0| multiple_choice_grade | 0.2328 | _ | 0.0120 |
 | bigbench_tracking_shuffled_objects_seven_objects| 0| multiple_choice_grade | 0.193714285714| _ | 0.0094 |
 | bigbench_tracking_shuffled_objects_three_objects| 0| multiple_choice_grade | 0.593333333333| _ | 0.0284 |
-
+```
 Average: 49.08%
+
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training: