gss1147 committed on
Commit 430bb20 · verified · 1 Parent(s): e9c4629

Update README.md

Files changed (1)
  1. README.md +54 -54
README.md CHANGED
@@ -1,54 +1,54 @@
- ---
- base_model: []
- library_name: transformers
- tags:
- - mergekit
- - merge
-
- ---
- # WithinUs_CPU_Hybrid
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [SLERP](https://en.wikipedia.org/wiki/Slerp) merge method.
-
- ### Models Merged
-
- The following models were included in the merge:
- * C:/Users/GSS1147/Desktop/Qwen3-0.6B-Sushi-Code-Expert
- * X:/AI_Models/sayantan0013-math-stack_Qwen3-0
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- base_model: C:/Users/GSS1147/Desktop/Qwen3-0.6B-Sushi-Code-Expert
- dtype: float16
- merge_method: slerp
- parameters:
-   t:
-   - filter: embed_tokens
-     value: 0.0
-   - filter: self_attn
-     value: 0.5
-   - filter: mlp
-     value: 0.5
-   - filter: lm_head
-     value: 1.0
-   - value: 0.5
- slices:
- - sources:
-   - layer_range:
-     - 0
-     - 28
-     model: C:/Users/GSS1147/Desktop/Qwen3-0.6B-Sushi-Code-Expert
-   - layer_range:
-     - 0
-     - 28
-     model: X:/AI_Models/sayantan0013-math-stack_Qwen3-0
-
- ```
 
+ ---
+ base_model: []
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+
+ ---
+ # Qwen3-0.6B-Sushi-Math-Code-Expert
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [SLERP](https://en.wikipedia.org/wiki/Slerp) merge method.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * Qwen3-0.6B-Sushi-Code-Expert
+ * sayantan0013-math-stack_Qwen3-0
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ base_model: C:/Users/GSS1147/Desktop/Qwen3-0.6B-Sushi-Code-Expert
+ dtype: float16
+ merge_method: slerp
+ parameters:
+   t:
+   - filter: embed_tokens
+     value: 0.0
+   - filter: self_attn
+     value: 0.5
+   - filter: mlp
+     value: 0.5
+   - filter: lm_head
+     value: 1.0
+   - value: 0.5
+ slices:
+ - sources:
+   - layer_range:
+     - 0
+     - 28
+     model: C:/Users/GSS1147/Desktop/Qwen3-0.6B-Sushi-Code-Expert
+   - layer_range:
+     - 0
+     - 28
+     model: X:/AI_Models/sayantan0013-math-stack_Qwen3-0
+
+ ```
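
For readers unfamiliar with the `t` list in the configuration above, the following is a minimal sketch of how a per-tensor interpolation factor could be chosen and applied. It is not mergekit's actual code. The names `T_RULES`, `resolve_t`, and `slerp` are hypothetical. Two things are assumed here: that a filter matches when it appears as a substring of the tensor name, and that `t = 0` returns the first endpoint (the base model) while `t = 1` returns the second.

```python
import math

# Hypothetical mirror of the `t` list in the YAML above: (filter, value) pairs.
# A filter of None stands for the final unfiltered default entry.
T_RULES = [
    ("embed_tokens", 0.0),
    ("self_attn", 0.5),
    ("mlp", 0.5),
    ("lm_head", 1.0),
    (None, 0.5),
]


def resolve_t(tensor_name, rules=T_RULES):
    """Return the value of the first rule whose filter occurs in tensor_name.

    Substring matching is an assumption made for this sketch.
    """
    for name_filter, value in rules:
        if name_filter is None or name_filter in tensor_name:
            return value
    raise ValueError(f"no rule matched {tensor_name!r}")


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Plain linear interpolation, used as a fallback for degenerate inputs.
    linear = [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    if norm0 < eps or norm1 < eps:
        return linear
    # Cosine of the angle between the two vectors, clamped for acos.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    if abs(dot) > 1 - eps:
        # Nearly colinear vectors: sin(omega) is close to zero, so fall back.
        return linear
    omega = math.acos(dot)
    sin_omega = math.sin(omega)
    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


if __name__ == "__main__":
    # Toy 2-element "weights" standing in for one tensor from each model.
    base_weights = [1.0, 0.0]
    other_weights = [0.0, 1.0]
    for name in ("model.embed_tokens.weight",
                 "model.layers.0.mlp.up_proj.weight",
                 "lm_head.weight"):
        t = resolve_t(name)
        print(name, t, slerp(t, base_weights, other_weights))
```

Under these assumptions, the embedding tensors would come entirely from the base model, `lm_head` entirely from the second model, and attention, MLP, and all remaining tensors would sit halfway along the arc between the two.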