Update README.md
README.md
CHANGED
@@ -101,12 +101,12 @@ while True:
The following results have been re-evaluated; each score is the average for that test.

| Benchmark   | Qwen2.5-7B-Instruct | Qwen2.5-7B-Instruct-abliterated-v2 | Qwen2.5-7B-Instruct-abliterated |
|-------------|---------------------|------------------------------------|---------------------------------|
| IF_Eval     | 76.44               | **77.82**                          | 76.49                           |
| MMLU Pro    | **43.12**           | 42.03                              | 41.71                           |
| TruthfulQA  | 62.46               | 57.81                              | **64.92**                       |
| BBH         | **53.92**           | 53.01                              | 52.77                           |
| GPQA        | 31.91               | **32.17**                          | 31.97                           |

The script used for evaluation is in this repository at /eval.sh, and can also be viewed [here](https://huggingface.co/huihui-ai/Qwen2.5-7B-Instruct-abliterated-v2/blob/main/eval.sh).
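The bolded entries in the table above mark the highest score in each row. As a minimal sketch for checking them, the snippet below copies the table values into a dictionary and picks the top-scoring model per benchmark. The names `MODELS`, `SCORES`, and `best_per_benchmark` are illustrative assumptions and are not part of eval.sh or this repository.

```python
# Column order matches the benchmark table above.
# MODELS, SCORES and best_per_benchmark are hypothetical names used only
# for this illustration; they do not come from eval.sh.
MODELS = [
    "Qwen2.5-7B-Instruct",
    "Qwen2.5-7B-Instruct-abliterated-v2",
    "Qwen2.5-7B-Instruct-abliterated",
]

# Scores copied verbatim from the table, one list per benchmark row.
SCORES = {
    "IF_Eval": [76.44, 77.82, 76.49],
    "MMLU Pro": [43.12, 42.03, 41.71],
    "TruthfulQA": [62.46, 57.81, 64.92],
    "BBH": [53.92, 53.01, 52.77],
    "GPQA": [31.91, 32.17, 31.97],
}


def best_per_benchmark(scores, models):
    """Return {benchmark: (model_name, score)} for the highest score in each row."""
    best = {}
    for benchmark, row in scores.items():
        # Index of the largest value in this row.
        top = max(range(len(row)), key=row.__getitem__)
        best[benchmark] = (models[top], row[top])
    return best


if __name__ == "__main__":
    for benchmark, (model, score) in best_per_benchmark(SCORES, MODELS).items():
        print(f"{benchmark:<12} {score:>6.2f}  {model}")
```

Running the script prints one line per benchmark with the winning score and model, which should line up with the bold cells in the table.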