Update README.md
--- a/README.md
+++ b/README.md
@@ -30,22 +30,31 @@ The yaml config file for this model is here:
 slices:
   - sources:
       - model: viethq188/LeoScorpius-7B-Chat-DPO
-
-        layer_range: [0, 32]
+        layer_range: [0, 32]
       - model: GreenNode/GreenNodeLM-7B-v1olet
         layer_range: [0, 32]
 merge_method: slerp
 base_model: GreenNode/GreenNodeLM-7B-v1olet
 parameters:
   t:
+    - filter: lm_head
+      value: [0.55]
+    - filter: embed_tokens
+      value: [0.7]
     - filter: self_attn
-      value: [0
+      value: [0.65, 0.35]
     - filter: mlp
-      value:
+      value: [0.35, 0.65]
-    -
+    - filter: layernorm
+      value: [0.4, 0.6]
+    - filter: modelnorm
+      value: [0.6]
+    - value: 0.5 # fallback for rest of tensors
 dtype: bfloat16
 ```
 
+Thank you [Undi95](https://huggingface.co/Undi95) for the secret sauce and (Charles Goddard)[https://huggingface.co/chargoddard] for mergekit.
+
 # Prompt template
 
 - **ChatML**
@@ -93,7 +102,8 @@ Detailed results can be found here.
 | GSM8K (5-shot) | ? |
 
 # Acknowlegement
-- [mergekit](https://github.com/cg123/mergekit
+- [mergekit](https://github.com/cg123/mergekit
+)
 - [DARE](https://github.com/yule-BUAA/MergeLM/blob/main/README.md)
 -
 [SLERP](https://github.com/Digitous/LLM-SLERP-Merge)
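
Reviewer note: the new `t` entries pair a tensor-name filter with a short list of values. The sketch below is one way to read them. It assumes a list such as `[0.65, 0.35]` is spread linearly across the 32 layers, and that `t = 0` keeps the `base_model` weights while `t = 1` takes the other model's. Both assumptions come from my reading of mergekit's documented behaviour, not from this commit. The helpers `gradient_value` and `slerp` are hypothetical names written for illustration and are not mergekit functions.

```python
import math


def gradient_value(anchors, layer_idx, num_layers):
    """Linearly interpolate a list of anchor values across layer positions.

    Assumption: anchors are evenly spaced from the first layer to the last.
    """
    if len(anchors) == 1:
        return float(anchors[0])
    pos = layer_idx / max(num_layers - 1, 1)   # 0.0 at first layer, 1.0 at last
    scaled = pos * (len(anchors) - 1)
    lo = min(int(scaled), len(anchors) - 2)    # index of the left anchor
    frac = scaled - lo
    return (1.0 - frac) * anchors[lo] + frac * anchors[lo + 1]


def slerp(t, v0, v1):
    """Spherical linear interpolation between two non-zero flat vectors."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))             # guard acos against rounding
    if abs(dot) > 1.0 - 1e-6:
        # Nearly parallel vectors: plain linear interpolation is stable here.
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]
    theta = math.acos(dot)
    s = math.sin(theta)
    w0 = math.sin((1.0 - t) * theta) / s
    w1 = math.sin(t * theta) / s
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# self_attn uses [0.65, 0.35]: t falls from 0.65 at layer 0 to 0.35 at layer 31.
t_first = gradient_value([0.65, 0.35], 0, 32)
t_last = gradient_value([0.65, 0.35], 31, 32)
```

Under these assumptions, the `self_attn` and `mlp` lists mirror each other, so attention weights lean toward one model in early layers while MLP weights lean toward the other, with the balance reversing in later layers.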