schonsense committed
Commit b2d885b · verified · Parent(s): 1a4dd33

Update README.md

Files changed (1): README.md (+58 −58)
README.md CHANGED
@@ -1,58 +1,58 @@
- ---
- base_model:
- - WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B
- - meta-llama/Llama-3.3-70B-Instruct
- library_name: transformers
- tags:
- - mergekit
- - merge
-
- ---
- # 70B_unstructWR
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [DELLA](https://arxiv.org/abs/2406.11617) merge method using [meta-llama/Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B](https://huggingface.co/WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B)
- * D:\mergekit\_My_YAMLS\70B_unstruct
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- models:
-
-   - model: "D:\\mergekit\\_My_YAMLS\\70B_unstruct"
-     parameters:
-       density: 0.7
-       epsilon: 0.2
-       weight: 0.9
-
-   - model: WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B
-     parameters:
-       density: 0.9
-       epsilon: 0.05
-       weight: 0.1
-
-   - model: meta-llama/Llama-3.3-70B-Instruct
- merge_method: della
- base_model: meta-llama/Llama-3.3-70B-Instruct
- tokenizer_source: meta-llama/Llama-3.3-70B-Instruct
- parameters:
-   normalize: false
-   int8_mask: false
-   lambda: 1.0
-
-
- dtype: float32
-
- out_dtype: bfloat16
- ```
 
+ ---
+ base_model:
+ - WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B
+ - meta-llama/Llama-3.3-70B-Instruct
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+
+ ---
+ # 70B_unstructWR
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [DELLA](https://arxiv.org/abs/2406.11617) merge method using [meta-llama/Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) as a base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B](https://huggingface.co/WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B)
+ * D:\mergekit\_My_YAMLS\70B_unstruct
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ models:
+
+   - model: schonsense/70B_unstruct
+     parameters:
+       density: 0.7
+       epsilon: 0.2
+       weight: 0.9
+
+   - model: WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B
+     parameters:
+       density: 0.9
+       epsilon: 0.05
+       weight: 0.1
+
+   - model: meta-llama/Llama-3.3-70B-Instruct
+ merge_method: della
+ base_model: meta-llama/Llama-3.3-70B-Instruct
+ tokenizer_source: meta-llama/Llama-3.3-70B-Instruct
+ parameters:
+   normalize: false
+   int8_mask: false
+   lambda: 1.0
+
+
+ dtype: float32
+
+ out_dtype: bfloat16
+ ```
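Since the configuration sets `normalize: false`, mergekit applies the per-model weights as given rather than rescaling them, so the weights should already sum to 1.0. A minimal Python sketch (illustrative only, not part of mergekit; the parameter values are copied from the YAML above) that sanity-checks this:

```python
# DELLA per-model parameters, copied from the merge configuration above.
models = [
    {"model": "schonsense/70B_unstruct",
     "density": 0.7, "epsilon": 0.2, "weight": 0.9},
    {"model": "WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B",
     "density": 0.9, "epsilon": 0.05, "weight": 0.1},
]

# With normalize: false the weights are used as-is, so keeping the
# total at 1.0 avoids rescaling the merged task vectors.
total_weight = sum(m["weight"] for m in models)
assert abs(total_weight - 1.0) < 1e-9

# Density and epsilon are probability-like knobs and should lie in [0, 1].
for m in models:
    assert 0.0 < m["density"] <= 1.0
    assert 0.0 <= m["epsilon"] < 1.0

print(f"total weight = {total_weight:.2f}")
```

The merge itself would then be produced by running mergekit's `mergekit-yaml` entry point on this configuration file.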