Improve language tag

#1
by lbourdois - opened
Files changed (1)
  1. README.md +59 -46
README.md CHANGED
@@ -1,46 +1,59 @@
- ---
- base_model:
- - deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
- - NovaSky-AI/Sky-T1-32B-Preview
- - Qwen/Qwen2.5-32B
- - simplescaling/s1.1-32B
- library_name: transformers
- tags:
- - mergekit
- - merge
-
- ---
- # S1.1-R1-T1-32B
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the sce merge method using [Qwen/Qwen2.5-32B](https://huggingface.co/Qwen/Qwen2.5-32B) as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [deepseek-ai/DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B)
- * [NovaSky-AI/Sky-T1-32B-Preview](https://huggingface.co/NovaSky-AI/Sky-T1-32B-Preview)
- * [simplescaling/s1.1-32B](https://huggingface.co/simplescaling/s1.1-32B)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- models:
-   # Pivot model
-   - model: Qwen/Qwen2.5-32B
-   # Target models
-   - model: simplescaling/s1.1-32B
-   - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
-   - model: NovaSky-AI/Sky-T1-32B-Preview
- merge_method: sce
- base_model: Qwen/Qwen2.5-32B
- parameters:
-   select_topk: 1.0
- dtype: bfloat16
- ```
+ ---
+ base_model:
+ - deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
+ - NovaSky-AI/Sky-T1-32B-Preview
+ - Qwen/Qwen2.5-32B
+ - simplescaling/s1.1-32B
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ ---
+ # S1.1-R1-T1-32B
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the sce merge method using [Qwen/Qwen2.5-32B](https://huggingface.co/Qwen/Qwen2.5-32B) as a base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [deepseek-ai/DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B)
+ * [NovaSky-AI/Sky-T1-32B-Preview](https://huggingface.co/NovaSky-AI/Sky-T1-32B-Preview)
+ * [simplescaling/s1.1-32B](https://huggingface.co/simplescaling/s1.1-32B)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ models:
+   # Pivot model
+   - model: Qwen/Qwen2.5-32B
+   # Target models
+   - model: simplescaling/s1.1-32B
+   - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
+   - model: NovaSky-AI/Sky-T1-32B-Preview
+ merge_method: sce
+ base_model: Qwen/Qwen2.5-32B
+ parameters:
+   select_topk: 1.0
+ dtype: bfloat16
+ ```
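
The only substantive change in this diff is the new `language:` list in the README front matter; every other line is removed and re-added unchanged. As a rough way to review that list, the sketch below reads the codes out of a copy of the front matter and checks them against a small table of ISO 639-3 names. The names `README_FRONT_MATTER`, `ISO_639_3_NAMES`, and `read_list_field` are hypothetical helpers written for this illustration, not part of the pull request or of any Hugging Face tooling, and the parser only handles the flat `key:` / `- item` layout seen here.

```python
# Illustrative sketch: every name below is an assumption made for this
# example, not something defined by the pull request.

# Copy of the new front matter (base_model entries omitted for brevity).
README_FRONT_MATTER = """\
---
library_name: transformers
tags:
- mergekit
- merge
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
---
"""

# ISO 639-3 codes used in the diff, with their English language names.
ISO_639_3_NAMES = {
    "zho": "Chinese", "eng": "English", "fra": "French",
    "spa": "Spanish", "por": "Portuguese", "deu": "German",
    "ita": "Italian", "rus": "Russian", "jpn": "Japanese",
    "kor": "Korean", "vie": "Vietnamese", "tha": "Thai",
    "ara": "Arabic",
}


def read_list_field(front_matter, field):
    """Collect the '- item' lines that directly follow 'field:'."""
    items = []
    in_field = False
    for raw_line in front_matter.splitlines():
        line = raw_line.strip()
        if line == field + ":":
            in_field = True
        elif in_field and line.startswith("- "):
            items.append(line[2:].strip())
        elif in_field:
            break  # first non-item line ends the list
    return items


languages = read_list_field(README_FRONT_MATTER, "language")
unknown = [code for code in languages if code not in ISO_639_3_NAMES]

for code in languages:
    print(code, ISO_639_3_NAMES.get(code, "unknown code"))
```

A real check would more likely parse the front matter with a YAML library; the hand-rolled reader is used here only to keep the example free of third-party dependencies.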