Improve language tag

#1
by lbourdois - opened
Files changed (1)
  1. README.md +61 -47
README.md CHANGED
@@ -1,48 +1,62 @@
- ---
- base_model:
- - deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
- - Qwen/Qwen2.5-32B
- library_name: transformers
- tags:
- - mergekit
- - merge
- license: mit
- ---
- # merge
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using [Qwen/Qwen2.5-32B](https://huggingface.co/Qwen/Qwen2.5-32B) as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [deepseek-ai/DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- models:
-   - model: Qwen/Qwen2.5-32B
-     #no parameters necessary for base model
-   - model: Qwen/Qwen2.5-32B
-     parameters:
-       density: 0.5
-       weight: 0.5
-   - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
-     parameters:
-       density: 0.5
-       weight: 0.5
-
- merge_method: ties
- base_model: Qwen/Qwen2.5-32B
- parameters:
-   normalize: false
-   int8_mask: true
- dtype: float16
+ ---
+ base_model:
+ - deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
+ - Qwen/Qwen2.5-32B
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ license: mit
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ ---
+ # merge
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using [Qwen/Qwen2.5-32B](https://huggingface.co/Qwen/Qwen2.5-32B) as a base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [deepseek-ai/DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ models:
+   - model: Qwen/Qwen2.5-32B
+     #no parameters necessary for base model
+   - model: Qwen/Qwen2.5-32B
+     parameters:
+       density: 0.5
+       weight: 0.5
+   - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
+     parameters:
+       density: 0.5
+       weight: 0.5
+
+ merge_method: ties
+ base_model: Qwen/Qwen2.5-32B
+ parameters:
+   normalize: false
+   int8_mask: true
+ dtype: float16
  ```
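
For readers unfamiliar with the TIES settings in the configuration above, here is a minimal sketch of the three steps described in the TIES paper (trim, elect sign, disjoint merge), applied to toy Python lists rather than real model weights. The function names `trim` and `ties_merge` are hypothetical and this is not mergekit's implementation. It only illustrates how `density`, `weight`, and `normalize` might interact, and it does not model `int8_mask` or `dtype`.

```python
def trim(delta, density):
    """Keep the largest-magnitude `density` fraction of entries; zero the rest."""
    k = int(round(density * len(delta)))
    if k <= 0:
        return [0.0] * len(delta)
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:k])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def ties_merge(base, models, density, weights, normalize=False):
    """Toy TIES merge over flat parameter lists (illustrative assumption only)."""
    # Task vectors: difference between each model and the base.
    deltas = [[m_j - b_j for m_j, b_j in zip(m, base)] for m in models]
    # Trim each task vector, then scale it by its weight.
    trimmed = [[w * d for d in trim(delta, density)]
               for delta, w in zip(deltas, weights)]
    merged = []
    for j, b in enumerate(base):
        column = [t[j] for t in trimmed]
        # Elect a sign per parameter from the summed weighted deltas.
        sign = 1.0 if sum(column) >= 0 else -1.0
        # Disjoint merge: keep only the entries that agree with the elected sign.
        agree = [(v, w) for v, w in zip(column, weights) if v * sign > 0]
        update = sum(v for v, _ in agree)
        if normalize and agree:
            update /= sum(w for _, w in agree)
        merged.append(b + update)
    return merged


# Toy stand-ins for the two entries in the config: one model identical to the
# base (zero task vector) and one fine-tuned model.
base = [0.0, 1.0, -1.0, 2.0]
tuned = [0.5, 0.875, -0.25, 2.0625]
print(ties_merge(base, [base, tuned], density=0.5, weights=[0.5, 0.5]))
# [0.25, 1.0, -0.625, 2.0]
```

In this toy run the model that equals the base contributes a zero task vector, so the result is the base plus half of the two largest-magnitude deltas from the tuned model.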