mllm-dev committed on
Commit d3b567f · verified · 1 Parent(s): c0c0028

Upload folder using huggingface_hub
README.md CHANGED
@@ -1,15 +1,15 @@
  ---
  base_model:
- - mllm-dev/gpt2_f_experiment_2
- - mllm-dev/gpt2_f_experiment_4
- - mllm-dev/gpt2_f_experiment_5
  - mllm-dev/gpt2_f_experiment_7
- - mllm-dev/gpt2_f_experiment_9
- - mllm-dev/gpt2_f_experiment_1
- - mllm-dev/gpt2_f_experiment_8
  - mllm-dev/gpt2_f_experiment_3
+ - mllm-dev/gpt2_f_experiment_8
+ - mllm-dev/gpt2_f_experiment_1
  - mllm-dev/gpt2_f_experiment_0
+ - mllm-dev/gpt2_f_experiment_4
+ - mllm-dev/gpt2_f_experiment_2
  - mllm-dev/gpt2_f_experiment_6
+ - mllm-dev/gpt2_f_experiment_5
+ - mllm-dev/gpt2_f_experiment_9
  library_name: transformers
  tags:
  - mergekit
@@ -23,20 +23,20 @@ This is a merge of pre-trained language models created using [mergekit](https://
  ## Merge Details
  ### Merge Method

- This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using [mllm-dev/gpt2_f_experiment_0](https://huggingface.co/mllm-dev/gpt2_f_experiment_0) as a base.
+ This model was merged using the linear [DARE](https://arxiv.org/abs/2311.03099) merge method using [mllm-dev/gpt2_f_experiment_0](https://huggingface.co/mllm-dev/gpt2_f_experiment_0) as a base.

  ### Models Merged

  The following models were included in the merge:
- * [mllm-dev/gpt2_f_experiment_2](https://huggingface.co/mllm-dev/gpt2_f_experiment_2)
- * [mllm-dev/gpt2_f_experiment_4](https://huggingface.co/mllm-dev/gpt2_f_experiment_4)
- * [mllm-dev/gpt2_f_experiment_5](https://huggingface.co/mllm-dev/gpt2_f_experiment_5)
  * [mllm-dev/gpt2_f_experiment_7](https://huggingface.co/mllm-dev/gpt2_f_experiment_7)
- * [mllm-dev/gpt2_f_experiment_9](https://huggingface.co/mllm-dev/gpt2_f_experiment_9)
- * [mllm-dev/gpt2_f_experiment_1](https://huggingface.co/mllm-dev/gpt2_f_experiment_1)
- * [mllm-dev/gpt2_f_experiment_8](https://huggingface.co/mllm-dev/gpt2_f_experiment_8)
  * [mllm-dev/gpt2_f_experiment_3](https://huggingface.co/mllm-dev/gpt2_f_experiment_3)
+ * [mllm-dev/gpt2_f_experiment_8](https://huggingface.co/mllm-dev/gpt2_f_experiment_8)
+ * [mllm-dev/gpt2_f_experiment_1](https://huggingface.co/mllm-dev/gpt2_f_experiment_1)
+ * [mllm-dev/gpt2_f_experiment_4](https://huggingface.co/mllm-dev/gpt2_f_experiment_4)
+ * [mllm-dev/gpt2_f_experiment_2](https://huggingface.co/mllm-dev/gpt2_f_experiment_2)
  * [mllm-dev/gpt2_f_experiment_6](https://huggingface.co/mllm-dev/gpt2_f_experiment_6)
+ * [mllm-dev/gpt2_f_experiment_5](https://huggingface.co/mllm-dev/gpt2_f_experiment_5)
+ * [mllm-dev/gpt2_f_experiment_9](https://huggingface.co/mllm-dev/gpt2_f_experiment_9)

  ### Configuration

@@ -47,7 +47,7 @@ base_model:
  model:
  path: mllm-dev/gpt2_f_experiment_0
  dtype: float16
- merge_method: ties
+ merge_method: dare_linear
  parameters:
  normalize: 1.0
  slices:
@@ -56,67 +56,60 @@ slices:
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_0
+ parameters:
+ weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_1
  parameters:
- density: 0.9
  weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_2
  parameters:
- density: 0.9
  weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_3
  parameters:
- density: 0.9
  weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_4
  parameters:
- density: 0.9
  weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_5
  parameters:
- density: 0.9
  weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_6
  parameters:
- density: 0.9
  weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_7
  parameters:
- density: 0.9
  weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_8
  parameters:
- density: 0.9
  weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_9
  parameters:
- density: 0.9
  weight: 0.1
  ```
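The diff above swaps TIES (each expert thinned with `density: 0.9`) for a linear DARE merge in which all ten experts carry `weight: 0.1` and `normalize: 1.0` rescales the weights to sum to one. A minimal sketch of the per-tensor linear-combination step this implies (my simplification: DARE's random parameter dropping is omitted, and `merge_linear` plus the toy tensors are hypothetical, not mergekit internals):

```python
def merge_linear(tensors, weights, normalize=True):
    """Per-element weighted average; `normalize` rescales weights to sum to 1."""
    total = sum(weights)
    if normalize and total != 0:
        weights = [w / total for w in weights]
    return [sum(w * t[i] for w, t in zip(weights, tensors))
            for i in range(len(tensors[0]))]

# Ten stand-in "models", each a 2-element tensor, all weighted 0.1
# as in the config above.
tensors = [[float(k), float(k) + 1.0] for k in range(10)]
merged = merge_linear(tensors, [0.1] * 10)
print(merged)  # ≈ [4.5, 5.5]: the plain average of the ten experts
```

With equal weights and normalization, this reduces to a uniform average, which is why dropping `density` leaves the weights as the only knob in this config.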
mergekit_config.yml CHANGED
@@ -2,7 +2,7 @@ base_model:
  model:
  path: mllm-dev/gpt2_f_experiment_0
  dtype: float16
- merge_method: ties
+ merge_method: dare_linear
  parameters:
  normalize: 1.0
  slices:
@@ -11,66 +11,59 @@ slices:
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_0
+ parameters:
+ weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_1
  parameters:
- density: 0.9
  weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_2
  parameters:
- density: 0.9
  weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_3
  parameters:
- density: 0.9
  weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_4
  parameters:
- density: 0.9
  weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_5
  parameters:
- density: 0.9
  weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_6
  parameters:
- density: 0.9
  weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_7
  parameters:
- density: 0.9
  weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_8
  parameters:
- density: 0.9
  weight: 0.1
  - layer_range: [0, 12]
  model:
  model:
  path: mllm-dev/gpt2_f_experiment_9
  parameters:
- density: 0.9
  weight: 0.1
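After this change, every slice in mergekit_config.yml should carry an explicit `weight` and no `density`, and the ten weights of 0.1 should already sum to 1.0 before `normalize` even applies. A hedged sanity-check sketch (the `config_text` literal is a simplified stand-in for the real file layout, not the exact YAML above):

```python
import re

# Simplified stand-in for the updated mergekit_config.yml; only
# merge_method, the per-slice weights, and the absence of density
# matter for this check.
config_text = (
    "merge_method: dare_linear\nparameters:\n  normalize: 1.0\nslices:\n"
    + "".join(
        f"  - model: mllm-dev/gpt2_f_experiment_{k}\n    weight: 0.1\n"
        for k in range(10)
    )
)

weights = [float(w) for w in re.findall(r"weight:\s*([\d.]+)", config_text)]
assert "density:" not in config_text  # density was dropped in this commit
assert len(weights) == 10             # one weight per expert
print(round(sum(weights), 6))         # → 1.0
```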
model-00001-of-00001.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ac2d223e322da400f17b562a754f1a096a848b97d0f1bc42731b7a370dc0d4ef
+ oid sha256:199bbba2ae7eac40e101bc7f4b828cbabdac8e8e3c135fb75d74e380fc957107
  size 248902264