mllm-dev committed (verified)
Commit c0c0028 · Parent(s): fd99836

Upload folder using huggingface_hub

Files changed (4):
  1. README.md +34 -23
  2. config.json +1 -1
  3. mergekit_config.yml +24 -12
  4. model-00001-of-00001.safetensors +1 -1
README.md CHANGED

````diff
@@ -1,14 +1,14 @@
 ---
 base_model:
-- mllm-dev/gpt2_f_experiment_1
+- mllm-dev/gpt2_f_experiment_2
+- mllm-dev/gpt2_f_experiment_4
 - mllm-dev/gpt2_f_experiment_5
-- mllm-dev/gpt2_f_experiment_0
 - mllm-dev/gpt2_f_experiment_7
 - mllm-dev/gpt2_f_experiment_9
-- mllm-dev/gpt2_f_experiment_2
-- mllm-dev/gpt2_f_experiment_3
+- mllm-dev/gpt2_f_experiment_1
 - mllm-dev/gpt2_f_experiment_8
-- mllm-dev/gpt2_f_experiment_4
+- mllm-dev/gpt2_f_experiment_3
+- mllm-dev/gpt2_f_experiment_0
 - mllm-dev/gpt2_f_experiment_6
 library_name: transformers
 tags:
@@ -23,20 +23,19 @@ This is a merge of pre-trained language models created using [mergekit](https://
 ## Merge Details
 ### Merge Method
 
-This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge method.
+This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using [mllm-dev/gpt2_f_experiment_0](https://huggingface.co/mllm-dev/gpt2_f_experiment_0) as a base.
 
 ### Models Merged
 
 The following models were included in the merge:
-* [mllm-dev/gpt2_f_experiment_1](https://huggingface.co/mllm-dev/gpt2_f_experiment_1)
+* [mllm-dev/gpt2_f_experiment_2](https://huggingface.co/mllm-dev/gpt2_f_experiment_2)
+* [mllm-dev/gpt2_f_experiment_4](https://huggingface.co/mllm-dev/gpt2_f_experiment_4)
 * [mllm-dev/gpt2_f_experiment_5](https://huggingface.co/mllm-dev/gpt2_f_experiment_5)
-* [mllm-dev/gpt2_f_experiment_0](https://huggingface.co/mllm-dev/gpt2_f_experiment_0)
 * [mllm-dev/gpt2_f_experiment_7](https://huggingface.co/mllm-dev/gpt2_f_experiment_7)
 * [mllm-dev/gpt2_f_experiment_9](https://huggingface.co/mllm-dev/gpt2_f_experiment_9)
-* [mllm-dev/gpt2_f_experiment_2](https://huggingface.co/mllm-dev/gpt2_f_experiment_2)
-* [mllm-dev/gpt2_f_experiment_3](https://huggingface.co/mllm-dev/gpt2_f_experiment_3)
+* [mllm-dev/gpt2_f_experiment_1](https://huggingface.co/mllm-dev/gpt2_f_experiment_1)
 * [mllm-dev/gpt2_f_experiment_8](https://huggingface.co/mllm-dev/gpt2_f_experiment_8)
-* [mllm-dev/gpt2_f_experiment_4](https://huggingface.co/mllm-dev/gpt2_f_experiment_4)
+* [mllm-dev/gpt2_f_experiment_3](https://huggingface.co/mllm-dev/gpt2_f_experiment_3)
 * [mllm-dev/gpt2_f_experiment_6](https://huggingface.co/mllm-dev/gpt2_f_experiment_6)
 
 ### Configuration
@@ -44,68 +43,80 @@ The following models were included in the merge:
 The following YAML configuration was used to produce this model:
 
 ```yaml
+base_model:
+  model:
+    path: mllm-dev/gpt2_f_experiment_0
 dtype: float16
-merge_method: linear
+merge_method: ties
+parameters:
+  normalize: 1.0
 slices:
 - sources:
   - layer_range: [0, 12]
     model:
       model:
         path: mllm-dev/gpt2_f_experiment_0
-    parameters:
-      weight: 1.0
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_1
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_2
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_3
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_4
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_5
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_6
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_7
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_8
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_9
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
 ```
````
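The diff above swaps a plain linear average for a TIES merge in which every non-base model contributes with `density: 0.9` and `weight: 0.1`. As a rough illustration of what those parameters mean, here is a minimal NumPy sketch of a TIES-style merge step (this is not mergekit's actual implementation): trim each task vector (fine-tuned minus base) to its highest-magnitude `density` fraction, elect a per-parameter sign by summed magnitude, and average the sign-agreeing, `weight`-scaled values back onto the base.

```python
# Minimal TIES-style merge sketch (illustrative only, not mergekit's code).
import numpy as np

def ties_merge(base, tuned, density=0.9, weight=0.1):
    """Merge a list of fine-tuned weight arrays `tuned` into `base`."""
    deltas = [t - base for t in tuned]          # task vectors
    trimmed = []
    for d in deltas:
        k = int(round(density * d.size))        # entries to keep per model
        if k <= 0:
            d = np.zeros_like(d)
        elif k < d.size:
            thresh = np.sort(np.abs(d).ravel())[-k]
            d = np.where(np.abs(d) >= thresh, d, 0.0)  # drop low-magnitude entries
        trimmed.append(weight * d)
    stacked = np.stack(trimmed)
    sign = np.sign(stacked.sum(axis=0))         # elected sign per parameter
    # keep only contributions that agree with the elected sign
    agree = (np.sign(stacked) == sign) & (stacked != 0)
    n = np.maximum(agree.sum(axis=0), 1)        # avoid divide-by-zero
    merged_delta = np.where(agree, stacked, 0.0).sum(axis=0) / n
    return base + merged_delta
```

With full density and unit weight, parameters where the models disagree in sign (e.g. +2 vs. -2) cancel to zero, while sign-agreeing values are averaged; the `normalize: 1.0` setting in the config enables mergekit's analogous rescaling of the combined deltas.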
config.json CHANGED

```diff
@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "mllm-dev/gpt2_f_experiment_1",
+  "_name_or_path": "mllm-dev/gpt2_f_experiment_0",
   "activation_function": "gelu_new",
   "architectures": [
     "GPT2ForSequenceClassification"
```
mergekit_config.yml CHANGED

```diff
@@ -1,64 +1,76 @@
+base_model:
+  model:
+    path: mllm-dev/gpt2_f_experiment_0
 dtype: float16
-merge_method: linear
+merge_method: ties
+parameters:
+  normalize: 1.0
 slices:
 - sources:
   - layer_range: [0, 12]
     model:
       model:
         path: mllm-dev/gpt2_f_experiment_0
-    parameters:
-      weight: 1.0
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_1
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_2
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_3
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_4
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_5
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_6
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_7
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_8
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
   - layer_range: [0, 12]
     model:
      model:
        path: mllm-dev/gpt2_f_experiment_9
    parameters:
-      weight: 1.0
+      density: 0.9
+      weight: 0.1
```
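Assuming mergekit is installed, a config file like this one is normally applied with mergekit's `mergekit-yaml` command-line tool; the output directory below is illustrative:

```shell
# Install mergekit, then run the merge described by mergekit_config.yml
# into ./merged_model; --copy-tokenizer carries the base model's tokenizer
# into the output directory.
pip install mergekit
mergekit-yaml mergekit_config.yml ./merged_model --copy-tokenizer
```

The run downloads each `mllm-dev/gpt2_f_experiment_*` checkpoint from the Hub, so it needs network access and enough disk for all ten models.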
model-00001-of-00001.safetensors CHANGED

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4bcf9cb71e14431636c4e43fda6cda1a4e795f63a07b9fef99febec161914707
+oid sha256:ac2d223e322da400f17b562a754f1a096a848b97d0f1bc42731b7a370dc0d4ef
 size 248902264
```