Update README.md
README.md
CHANGED
@@ -95,3 +95,16 @@ while True:
+## Evaluations
+Each score below was re-evaluated and is reported as the average for that test.
+
+| Benchmark   | Qwen2.5-Coder-7B-Instruct | Qwen2.5-Coder-7B-Instruct-abliterated |
+|-------------|---------------------------|---------------------------------------|
+| IF_Eval     | **63.14**                 | 61.90                                 |
+| MMLU Pro    | 33.54                     | **33.56**                             |
+| TruthfulQA  | **51.804**                | 48.8                                  |
+| BBH         | 46.98                     | **47.17**                             |
+| GPQA        | **32.85**                 | 32.63                                 |
+
+The evaluation script is in this repository at /eval.sh, or available [here](https://huggingface.co/huihui-ai/Qwen2.5-Coder-7B-Instruct-abliterated/blob/main/eval.sh).
+
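As a reading aid for the evaluation table, here is a minimal sketch that computes the per-benchmark difference between the two models. It is not part of the commit or of eval.sh; the `scores` dictionary and the `deltas` helper are hypothetical names introduced for illustration, and the numbers are copied from the table.

```python
# Scores copied from the evaluation table: (original, abliterated).
# The dict layout and the helper name are assumptions for this sketch.
scores = {
    "IF_Eval": (63.14, 61.90),
    "MMLU Pro": (33.54, 33.56),
    "TruthfulQA": (51.804, 48.8),
    "BBH": (46.98, 47.17),
    "GPQA": (32.85, 32.63),
}


def deltas(table):
    """Return (abliterated - original) per benchmark, rounded to 3 decimals."""
    return {name: round(abl - orig, 3) for name, (orig, abl) in table.items()}


if __name__ == "__main__":
    # A negative value means the abliterated model scored lower.
    for name, diff in deltas(scores).items():
        print(f"{name:<12} {diff:+.3f}")
```

Read this way, the largest gap in the table is on TruthfulQA, while the MMLU Pro, BBH, and GPQA scores differ by less than a quarter of a point.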