Improve language tag

#2
by lbourdois - opened
Files changed (1)
  1. README.md +87 -73
README.md CHANGED
@@ -1,74 +1,88 @@
- ---
- base_model:
- - Qwen/Qwen2.5-14B-Instruct
- - Qwen/Qwen2.5-14B
- - CultriX/Qwen2.5-14B-Hyperionv3_r128
- - CultriX/Qwen2.5-14B_Virtuoso-small-v2-LoRA_r128
- library_name: transformers
- tags:
- - mergekit
- - merge
- license: apache-2.0
- ---
- # Qwen2.5-DeepHyper
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [DARE TIES](https://arxiv.org/abs/2311.03099) merge method using [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
- * [CultriX/Qwen2.5-14B-Hyperionv3_r128](https://huggingface.co/CultriX/Qwen2.5-14B-Hyperionv3_r128)
- * /root/.cache/huggingface/hub/models--CultriX--Qwen2.5-14B-DeepSeek_r128/snapshots/1bca847f92fced165076d9ac921a1e3ef01fcd7f/
- * [CultriX/Qwen2.5-14B_Virtuoso-small-v2-LoRA_r128](https://huggingface.co/CultriX/Qwen2.5-14B_Virtuoso-small-v2-LoRA_r128)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- base_model: Qwen/Qwen2.5-14B
- models:
-   # Each adapter was extracted (rank=128) from its respective finetuned model.
-   # Their weights are set lower than the full instruct model (which is now the base)
-   - model: CultriX/Qwen2.5-14B-Hyperionv3_r128
-     parameters:
-       weight: 0.9 # Reduced weight relative to base
-       density: 0.9
-
-   - model: CultriX/Qwen2.5-14B_Virtuoso-small-v2-LoRA_r128
-     parameters:
-       weight: 1.0
-       density: 1.0
-
-   - model: Qwen/Qwen2.5-14B-Instruct
-     parameters:
-       weight: 0.75
-       density: 0.75
-
-   - model: /root/.cache/huggingface/hub/models--CultriX--Qwen2.5-14B-DeepSeek_r128/snapshots/1bca847f92fced165076d9ac921a1e3ef01fcd7f/
-     parameters:
-       weight: 1.00
-       density: 1.00
-
- # Merging method and overall parameters
- merge_method: dare_ties # Ties corresponding weights across sources.
- parameters:
-   weight: 1.0 # Overall scaling factor.
-   density: 1.0 # Overall density (typically left at 1.0).
-   normalize: true # Normalize each set of weights before merging.
-   int8_mask: true # Enable masking if using int8 quantized weights.
-
- # Use the instruct tokenizer to ensure compatibility.
- tokenizer_source: CultriX/Qwen2.5-14B_Virtuoso-small-v2-LoRA_r128
-
- # Data type for merged weights.
- dtype: bfloat16
-
-
  ```
+ ---
+ base_model:
+ - Qwen/Qwen2.5-14B-Instruct
+ - Qwen/Qwen2.5-14B
+ - CultriX/Qwen2.5-14B-Hyperionv3_r128
+ - CultriX/Qwen2.5-14B_Virtuoso-small-v2-LoRA_r128
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ license: apache-2.0
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ ---
+ # Qwen2.5-DeepHyper
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [DARE TIES](https://arxiv.org/abs/2311.03099) merge method using [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) as a base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
+ * [CultriX/Qwen2.5-14B-Hyperionv3_r128](https://huggingface.co/CultriX/Qwen2.5-14B-Hyperionv3_r128)
+ * /root/.cache/huggingface/hub/models--CultriX--Qwen2.5-14B-DeepSeek_r128/snapshots/1bca847f92fced165076d9ac921a1e3ef01fcd7f/
+ * [CultriX/Qwen2.5-14B_Virtuoso-small-v2-LoRA_r128](https://huggingface.co/CultriX/Qwen2.5-14B_Virtuoso-small-v2-LoRA_r128)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ base_model: Qwen/Qwen2.5-14B
+ models:
+   # Each adapter was extracted (rank=128) from its respective finetuned model.
+   # Their weights are set lower than the full instruct model (which is now the base)
+   - model: CultriX/Qwen2.5-14B-Hyperionv3_r128
+     parameters:
+       weight: 0.9 # Reduced weight relative to base
+       density: 0.9
+
+   - model: CultriX/Qwen2.5-14B_Virtuoso-small-v2-LoRA_r128
+     parameters:
+       weight: 1.0
+       density: 1.0
+
+   - model: Qwen/Qwen2.5-14B-Instruct
+     parameters:
+       weight: 0.75
+       density: 0.75
+
+   - model: /root/.cache/huggingface/hub/models--CultriX--Qwen2.5-14B-DeepSeek_r128/snapshots/1bca847f92fced165076d9ac921a1e3ef01fcd7f/
+     parameters:
+       weight: 1.00
+       density: 1.00
+
+ # Merging method and overall parameters
+ merge_method: dare_ties # Ties corresponding weights across sources.
+ parameters:
+   weight: 1.0 # Overall scaling factor.
+   density: 1.0 # Overall density (typically left at 1.0).
+   normalize: true # Normalize each set of weights before merging.
+   int8_mask: true # Enable masking if using int8 quantized weights.
+
+ # Use the instruct tokenizer to ensure compatibility.
+ tokenizer_source: CultriX/Qwen2.5-14B_Virtuoso-small-v2-LoRA_r128
+
+ # Data type for merged weights.
+ dtype: bfloat16
+
+
  ```
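
For readers unfamiliar with the per-model `density` and `weight` values in the configuration shown in the diff, here is a rough pure-Python sketch of the idea behind a DARE TIES style merge. It is not mergekit's implementation. The function names `dare_prune` and `sign_consensus_merge` are hypothetical, and the sketch works on flat lists of floats standing in for each model's difference from the base (its "delta") rather than on real tensors. It assumes `density` is the fraction of delta entries kept, with survivors rescaled by `1 / density`, and that conflicting signs are resolved by a weighted vote before averaging.

```python
import random


def dare_prune(delta, density, seed=0):
    """Keep each delta entry with probability `density`; rescale survivors by 1/density."""
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    rng = random.Random(seed)
    # rng.random() is in [0, 1), so density == 1.0 keeps every entry unchanged.
    return [d / density if rng.random() < density else 0.0 for d in delta]


def sign_consensus_merge(deltas, weights):
    """Per parameter: elect a sign by weighted sum, then average the entries that agree."""
    merged = []
    for column in zip(*deltas):
        weighted = [w * d for w, d in zip(weights, column)]
        sign = 1.0 if sum(weighted) >= 0 else -1.0
        # Only entries whose sign matches the elected sign contribute.
        agreeing = [(w, wd) for w, wd in zip(weights, weighted) if wd * sign > 0]
        norm = sum(w for w, _ in agreeing)
        merged.append(sum(wd for _, wd in agreeing) / norm if norm else 0.0)
    return merged


if __name__ == "__main__":
    # Toy deltas for two hypothetical fine-tunes, using two of the weight/density pairs above.
    delta_a = dare_prune([0.2, -0.4, 0.1, 0.3], density=0.9, seed=1)
    delta_b = dare_prune([0.1, 0.5, -0.2, 0.3], density=0.75, seed=2)
    print(sign_consensus_merge([delta_a, delta_b], weights=[0.9, 0.75]))
```

With `density: 1.0`, as set for two of the four models, no entries are dropped and the pruning step leaves the delta unchanged. Only the `0.9` and `0.75` entries actually sparsify their deltas.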