mllm-dev committed
Commit 34c539a (verified) · Parent(s): 03c6f75

Upload folder using huggingface_hub

README.md CHANGED

@@ -1,10 +1,10 @@
 ---
 base_model:
-- mllm-dev/gpt2_f_experiment_2
 - mllm-dev/gpt2_f_experiment_4
 - mllm-dev/gpt2_f_experiment_1
-- mllm-dev/gpt2_f_experiment_3
+- mllm-dev/gpt2_f_experiment_2
 - mllm-dev/gpt2_f_experiment_0
+- mllm-dev/gpt2_f_experiment_3
 library_name: transformers
 tags:
 - mergekit
@@ -18,14 +18,14 @@ This is a merge of pre-trained language models created using [mergekit](https://
 ## Merge Details
 ### Merge Method
 
-This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using [mllm-dev/gpt2_f_experiment_0](https://huggingface.co/mllm-dev/gpt2_f_experiment_0) as a base.
+This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge method using [mllm-dev/gpt2_f_experiment_0](https://huggingface.co/mllm-dev/gpt2_f_experiment_0) as a base.
 
 ### Models Merged
 
 The following models were included in the merge:
-* [mllm-dev/gpt2_f_experiment_2](https://huggingface.co/mllm-dev/gpt2_f_experiment_2)
 * [mllm-dev/gpt2_f_experiment_4](https://huggingface.co/mllm-dev/gpt2_f_experiment_4)
 * [mllm-dev/gpt2_f_experiment_1](https://huggingface.co/mllm-dev/gpt2_f_experiment_1)
+* [mllm-dev/gpt2_f_experiment_2](https://huggingface.co/mllm-dev/gpt2_f_experiment_2)
 * [mllm-dev/gpt2_f_experiment_3](https://huggingface.co/mllm-dev/gpt2_f_experiment_3)
 
 ### Configuration
@@ -37,10 +37,7 @@ base_model:
   model:
     path: mllm-dev/gpt2_f_experiment_0
 dtype: float16
-merge_method: ties
-parameters:
-  int8_mask: 0.0
-  normalize: 1.0
+merge_method: linear
 slices:
 - sources:
   - layer_range: [0, 12]
@@ -48,34 +45,29 @@ slices:
       model:
         path: mllm-dev/gpt2_f_experiment_0
     parameters:
-      density: 1.0
       weight: 1.0
   - layer_range: [0, 12]
     model:
       model:
         path: mllm-dev/gpt2_f_experiment_1
     parameters:
-      density: 1.0
       weight: 1.0
   - layer_range: [0, 12]
     model:
       model:
         path: mllm-dev/gpt2_f_experiment_2
     parameters:
-      density: 1.0
       weight: 1.0
   - layer_range: [0, 12]
     model:
       model:
         path: mllm-dev/gpt2_f_experiment_3
     parameters:
-      density: 1.0
       weight: 1.0
   - layer_range: [0, 12]
     model:
       model:
         path: mllm-dev/gpt2_f_experiment_4
     parameters:
-      density: 1.0
       weight: 1.0
 ```
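
This commit switches the merge from TIES to a plain linear merge, which is why the TIES-specific `density` and `int8_mask` parameters (and the explicit `normalize` flag) disappear from the configuration. As a rough illustration of what `merge_method: linear` computes (a normalized weighted average of matching parameter tensors), here is a minimal sketch. It is not mergekit's implementation; the repo names come from the config above, and the output path is hypothetical.

```python
# Sketch of a linear merge: a weighted average of parameter tensors.
# NOT mergekit's code; for intuition only.
import torch
from transformers import AutoModelForCausalLM

repos = [f"mllm-dev/gpt2_f_experiment_{i}" for i in range(5)]
weights = [1.0] * len(repos)  # every entry in the config uses weight: 1.0

state_dicts = [
    AutoModelForCausalLM.from_pretrained(r, torch_dtype=torch.float16).state_dict()
    for r in repos
]

total = sum(weights)  # dividing by the weight sum mirrors linear's normalization
merged_state = {
    name: sum(w * sd[name].float() for w, sd in zip(weights, state_dicts)) / total
    for name in state_dicts[0]
}

base = AutoModelForCausalLM.from_pretrained(repos[0], torch_dtype=torch.float16)
base.load_state_dict({k: v.half() for k, v in merged_state.items()})
base.save_pretrained("./gpt2_f_linear_merge")  # hypothetical output path
```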
mergekit_config.yml CHANGED

@@ -2,10 +2,7 @@ base_model:
   model:
     path: mllm-dev/gpt2_f_experiment_0
 dtype: float16
-merge_method: ties
-parameters:
-  int8_mask: 0.0
-  normalize: 1.0
+merge_method: linear
 slices:
 - sources:
   - layer_range: [0, 12]
@@ -13,33 +10,28 @@ slices:
       model:
         path: mllm-dev/gpt2_f_experiment_0
     parameters:
-      density: 1.0
       weight: 1.0
   - layer_range: [0, 12]
     model:
       model:
         path: mllm-dev/gpt2_f_experiment_1
     parameters:
-      density: 1.0
       weight: 1.0
   - layer_range: [0, 12]
     model:
       model:
         path: mllm-dev/gpt2_f_experiment_2
     parameters:
-      density: 1.0
       weight: 1.0
   - layer_range: [0, 12]
     model:
       model:
         path: mllm-dev/gpt2_f_experiment_3
     parameters:
-      density: 1.0
       weight: 1.0
   - layer_range: [0, 12]
     model:
       model:
         path: mllm-dev/gpt2_f_experiment_4
     parameters:
-      density: 1.0
       weight: 1.0
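
With the updated mergekit_config.yml, the merge can be reproduced with mergekit's CLI (`mergekit-yaml mergekit_config.yml <output-dir>`) or its Python API. Below is a hedged sketch of the latter, following the usage pattern in mergekit's README; the `run_merge` and `MergeOptions` names should be checked against the installed version, and the output directory is hypothetical.

```python
# Sketch: re-run the merge described by the updated mergekit_config.yml.
# Assumes `pip install mergekit`; API names per mergekit's documented usage.
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("mergekit_config.yml", encoding="utf-8") as f:
    config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    config,
    "./gpt2_f_linear_merge",  # hypothetical output directory
    options=MergeOptions(copy_tokenizer=True),
)
```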
model-00001-of-00001.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:11322e145e7b61665593903f460c972df3374b662f8ac11f087211938f7fd91c
+oid sha256:109d22198c42220534f2b55ff9566334f14c2d3c6976f90d83b3d654b92dbc74
 size 248902264
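
Only the git-lfs pointer changes for the weights file: per the LFS spec, `oid sha256:...` is the SHA-256 digest of the actual file contents and `size` is its byte length, so a download can be checked against this commit. A minimal verification sketch (the local filename is an assumption):

```python
# Verify a downloaded model-00001-of-00001.safetensors against the
# LFS pointer recorded in this commit.
import hashlib
from pathlib import Path

path = Path("model-00001-of-00001.safetensors")  # assumed local path
expected_oid = "109d22198c42220534f2b55ff9566334f14c2d3c6976f90d83b3d654b92dbc74"
expected_size = 248902264  # bytes, from the pointer file

data = path.read_bytes()
assert len(data) == expected_size, "size mismatch"
assert hashlib.sha256(data).hexdigest() == expected_oid, "sha256 mismatch"
print("OK: file matches the post-commit pointer")
```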