Update README.md
README.md CHANGED

@@ -101,23 +101,20 @@ This is a test project for merging models.
 
 # Open LLM Leaderboard Evaluation Results
 
-Detailed results can be found here.
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_jan-hq__trinity-v1).
 
 | Metric                | Value |
 |-----------------------|---------------------------|
-| Avg. |
-| ARC (25-shot) |
-| HellaSwag (10-shot) |
-| MMLU (5-shot) |
-| TruthfulQA (0-shot) |
-| Winogrande (5-shot) |
-| GSM8K (5-shot) |
+| Avg.                  | 74.8  |
+| ARC (25-shot)         | 72.27 |
+| HellaSwag (10-shot)   | 88.36 |
+| MMLU (5-shot)         | 65.2  |
+| TruthfulQA (0-shot)   | 69.31 |
+| Winogrande (5-shot)   | 82    |
+| GSM8K (5-shot)        | 71.65 |
 
 # Acknowlegement
-- [mergekit](https://github.com/cg123/mergekit
-)
+- [mergekit](https://github.com/cg123/mergekit)
 - [DARE](https://github.com/yule-BUAA/MergeLM/blob/main/README.md)
--
-[SLERP](https://github.com/Digitous/LLM-SLERP-Merge)
-
+- [SLERP](https://github.com/Digitous/LLM-SLERP-Merge)
 - [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)
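The model card credits mergekit, DARE, and SLERP for the merge. As an illustration of the SLERP step, here is a minimal NumPy sketch of spherical linear interpolation between two weight tensors — the function name, the flatten-and-reshape treatment, and the parallel-vector fallback are assumptions for illustration, not the repository's actual merge code:

```python
import numpy as np

def slerp(w0: np.ndarray, w1: np.ndarray, t: float, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two same-shaped weight tensors.

    Interpolates along the arc between the two (flattened) weight vectors,
    falling back to plain linear interpolation when they are nearly parallel.
    """
    v0 = w0.ravel().astype(np.float64)
    v1 = w1.ravel().astype(np.float64)
    # Angle between the two vectors, computed on normalized copies.
    u0 = v0 / (np.linalg.norm(v0) + eps)
    u1 = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(u0, u1), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < eps:
        # Nearly parallel vectors: slerp degenerates to linear interpolation.
        return (1.0 - t) * w0 + t * w1
    s = np.sin(theta)
    mixed = (np.sin((1.0 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1
    return mixed.reshape(w0.shape).astype(w0.dtype)
```

In a real merge this would be applied per parameter tensor across two checkpoints (e.g. with a layer-dependent `t`), which is roughly what the SLERP merge scripts linked above do.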