Update README.md
README.md
CHANGED
@@ -14,6 +14,18 @@ tags:
 
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
+# First benchmarks
+
+| Model                                              | arc_challenge (0 shot) |
+|----------------------------------------------------|------------------------|
+| Qwen/Qwen3-1.7B                                    | 43                     |
+| ibm-granite/granite-3.3-2b-base                    | 44,54                  |
+| deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B          | 34,9                   |
+| openbmb/MiniCPM-2B-dpo-bf16                        | 44,28                  |
+| microsoft/bitnet-b1.58-2B-4T-bf16 (base model)     | 47,95                  |
+| jpacifico/bitnet-dpo-merged-modelstock7            | **51,62**              |
+
+
 ## Merge Details
 ### Merge Method
 