Update README.md
README.md
CHANGED
@@ -32,14 +32,9 @@ Evaluated on LM-Evaluation-Harness:
 
 | Task | Metric | Score | Stderr |
 |------|--------|-------|--------|
-| **MMLU** | acc | **0.2376** | ±0.0037 |
-| - Humanities | acc | 0.2472 | ±0.0067 |
-| - STEM | acc | 0.2245 | ±0.0074 |
-| - Social Sciences | acc | 0.2327 | ±0.0076 |
-| - Other | acc | 0.2430 | ±0.0077 |
-| **GSM8K** | exact_match | **0.0240** | ±0.0048 |
 | **HellaSwag** | acc_norm | **0.4430** | ±0.0157 |
 | **ARC-Easy** | acc_norm | **0.5450** | ±0.0158 |
+| **ARC-Easy** | acc_norm | **0.2884** | ±0.0132 |
 | **PIQA** | acc_norm | **0.6770** | ±0.0148 |
 | **WinoGrande** | acc | **0.5210** | ±0.0158 |
 