schonsense committed · verified
Commit: a838d27 · Parent(s): 5dc295f

Update README.md

Files changed (1): README.md (+69 −67)
---
base_model:
- schonsense/SOG_10k_70B
- meta-llama/Llama-3.1-70B
- meta-llama/Llama-3.3-70B-Instruct
- schonsense/70B_unstructWR
library_name: transformers
tags:
- mergekit
- merge

---
# 70B_SOG_unstructed

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6317d4867690c5b55e61ce3d/Ffz7KZjg70xYP58TRhIiG.png)

This is a merge of pre-trained language models created with [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the [DELLA](https://arxiv.org/abs/2406.11617) merge method, with [meta-llama/Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) as the base.
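
In broad strokes, DELLA prunes each fine-tuned model's delta parameters stochastically, assigning higher keep probabilities to larger-magnitude deltas within a window of width `epsilon` around `density`, then rescales survivors by `1/p` so each delta is preserved in expectation. A minimal illustrative sketch in NumPy — a simplification, not mergekit's implementation, and `della_prune` is a hypothetical name:

```python
import numpy as np

def della_prune(delta, density=0.7, epsilon=0.2, rng=None):
    """Sketch of DELLA-style magnitude-aware stochastic pruning.

    Larger-magnitude deltas get a higher keep probability; survivors are
    rescaled by 1/p so each delta is preserved in expectation.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    flat = delta.ravel()
    n = flat.size
    # Rank each delta by magnitude: rank 0 = smallest, n-1 = largest.
    ranks = np.argsort(np.argsort(np.abs(flat)))
    # Spread keep probabilities over [density - epsilon/2, density + epsilon/2].
    p = (density - epsilon / 2) + epsilon * ranks / max(n - 1, 1)
    mask = rng.random(n) < p
    # Keep with probability p, rescaling survivors by 1/p.
    pruned = np.where(mask, flat / p, 0.0)
    return pruned.reshape(delta.shape)
```

With `density: 0.7, epsilon: 0.2` (the values used for two of the models below), roughly 70% of deltas survive, with keep probabilities spread between 0.6 and 0.8 by magnitude rank.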

### Models Merged

The following models were included in the merge:
* [schonsense/SOG_10k_70B](https://huggingface.co/schonsense/SOG_10k_70B)
* [meta-llama/Llama-3.1-70B](https://huggingface.co/meta-llama/Llama-3.1-70B)
* [schonsense/70B_unstructWR](https://huggingface.co/schonsense/70B_unstructWR)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: schonsense/70B_unstructWR
    parameters:
      density: 0.7
      epsilon: 0.2
      weight: 0.4

  - model: schonsense/SOG_10k_70B
    parameters:
      density: 0.7
      epsilon: 0.2
      weight: 0.5

  - model: meta-llama/Llama-3.1-70B
    parameters:
      density: 0.8
      epsilon: 0.1
      weight: 0.1

  - model: meta-llama/Llama-3.3-70B-Instruct

merge_method: della
base_model: meta-llama/Llama-3.3-70B-Instruct
tokenizer_source: meta-llama/Llama-3.3-70B-Instruct
parameters:
  normalize: false
  int8_mask: false
  lambda: 1.0

dtype: float32
out_dtype: bfloat16
```
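
Two quick sanity checks on this config in plain Python (no mergekit required): with `normalize: false`, mergekit applies the weights as-is, so they should already sum to 1.0; and the `dtype`/`out_dtype` pair fixes the rough tensor-byte budget of the merge.

```python
# Per-model merge weights copied from the YAML above.
weights = {
    "schonsense/70B_unstructWR": 0.4,
    "schonsense/SOG_10k_70B": 0.5,
    "meta-llama/Llama-3.1-70B": 0.1,
}
total = sum(weights.values())
# With `normalize: false`, mergekit does not rescale these, so the
# config relies on them already summing to 1.0.
assert abs(total - 1.0) < 1e-9

# Rough size arithmetic for a ~70B-parameter model:
# merging in float32 = 4 bytes/param, writing bfloat16 = 2 bytes/param.
params = 70e9
merge_gb = params * 4 / 1e9   # ~280 GB of tensor data handled in float32
out_gb = params * 2 / 1e9     # ~140 GB written to disk as bfloat16
```

The merge itself can be reproduced with mergekit's `mergekit-yaml` CLI (roughly `mergekit-yaml config.yaml ./out-dir`); mergekit loads tensors lazily, so peak memory stays well below the full float32 footprint.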