Update README.md
README.md
@@ -73,16 +73,16 @@ In this section, we report the evaluation results of SmolLM2. All evaluations ar
| Metric | SmolLM2-1.7B-Instruct | SmolLM2-1.7B-Humanized | Difference |
|:-----------------------------|:---------------------:|:----------------------:|:----------:|
| MMLU | **49.5** | 48.8 | -0.7 |
| ARC (Easy) | **68.9** | 64.9 | -4.0 |
| ARC (Challenge) | 38.5 | **40.3** | +1.8 |
| HellaSwag | **71.7** | 71.3 | -0.4 |
| PIQA | **76.2** | 75.8 | -0.4 |
| WinoGrande | **62.5** | 61.2 | -1.3 |
| TriviaQA | **10.2** | 1.3 | -8.9 |
| GSM8K | **0.0** | **0.0** | +0.0 |
| OpenBookQA | **45.6** | 44.8 | -0.8 |
| QuAC (F1) | 30.2 | **31.1** | +0.9 |
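The Difference column is simply the Humanized score minus the Instruct score, rounded to one decimal with an explicit sign. A minimal sketch recomputing it (the score dictionary is copied from the table above; this helper is illustrative only and not part of the SmolLM2 repository):

```python
# Recompute the Difference column as humanized - instruct, one decimal,
# explicit sign. Scores are copied from the evaluation table; this is an
# illustrative sketch, not code from the SmolLM2 repo.
scores = {
    "MMLU": (49.5, 48.8),
    "ARC (Easy)": (68.9, 64.9),
    "ARC (Challenge)": (38.5, 40.3),
    "HellaSwag": (71.7, 71.3),
    "PIQA": (76.2, 75.8),
    "WinoGrande": (62.5, 61.2),
    "TriviaQA": (10.2, 1.3),
    "GSM8K": (0.0, 0.0),
    "OpenBookQA": (45.6, 44.8),
    "QuAC (F1)": (30.2, 31.1),
}

for metric, (instruct, humanized) in scores.items():
    # round() keeps the printed delta at table precision (one decimal)
    print(f"{metric}: {round(humanized - instruct, 1):+.1f}")
```

Recomputing the column this way makes the derived cells easy to audit whenever either score column is updated.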
## Limitations