mllm-dev committed
Commit b9213ab · verified · 1 Parent(s): 7fdeda9

Upload folder using huggingface_hub
README.md CHANGED
@@ -1,9 +1,9 @@
 ---
 base_model:
-- mllm-dev/gpt2_f_experiment_2_drug_data_new_run
+- mllm-dev/gpt2_f_experiment_0_drug_data_new_run
 - mllm-dev/gpt2_f_experiment_4_drug_data_new_run
+- mllm-dev/gpt2_f_experiment_2_drug_data_new_run
 - mllm-dev/gpt2_f_experiment_1_drug_data_new_run
-- mllm-dev/gpt2_f_experiment_0_drug_data_new_run
 - mllm-dev/gpt2_f_experiment_3_drug_data_new_run
 library_name: transformers
 tags:
@@ -11,22 +11,21 @@ tags:
 - merge
 
 ---
-# tam_test_merge_out_drug_data_linear_test_new_run
+# tam_test_merge_out_drug_data_dare_linear_test_new_run
 
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
 ## Merge Details
 ### Merge Method
 
-This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge method.
+This model was merged using the linear [DARE](https://arxiv.org/abs/2311.03099) merge method using [mllm-dev/gpt2_f_experiment_0_drug_data_new_run](https://huggingface.co/mllm-dev/gpt2_f_experiment_0_drug_data_new_run) as a base.
 
 ### Models Merged
 
 The following models were included in the merge:
-* [mllm-dev/gpt2_f_experiment_2_drug_data_new_run](https://huggingface.co/mllm-dev/gpt2_f_experiment_2_drug_data_new_run)
 * [mllm-dev/gpt2_f_experiment_4_drug_data_new_run](https://huggingface.co/mllm-dev/gpt2_f_experiment_4_drug_data_new_run)
+* [mllm-dev/gpt2_f_experiment_2_drug_data_new_run](https://huggingface.co/mllm-dev/gpt2_f_experiment_2_drug_data_new_run)
 * [mllm-dev/gpt2_f_experiment_1_drug_data_new_run](https://huggingface.co/mllm-dev/gpt2_f_experiment_1_drug_data_new_run)
-* [mllm-dev/gpt2_f_experiment_0_drug_data_new_run](https://huggingface.co/mllm-dev/gpt2_f_experiment_0_drug_data_new_run)
 * [mllm-dev/gpt2_f_experiment_3_drug_data_new_run](https://huggingface.co/mllm-dev/gpt2_f_experiment_3_drug_data_new_run)
 
 ### Configuration
@@ -34,28 +33,31 @@ The following models were included in the merge:
 The following YAML configuration was used to produce this model:
 
 ```yaml
+base_model: mllm-dev/gpt2_f_experiment_0_drug_data_new_run
 dtype: float16
-merge_method: linear
+merge_method: dare_linear
+parameters:
+  normalize: 1.0
 slices:
 - sources:
   - layer_range: [0, 12]
     model: mllm-dev/gpt2_f_experiment_0_drug_data_new_run
     parameters:
-      weight: 1.0
+      weight: 0.2
   - layer_range: [0, 12]
     model: mllm-dev/gpt2_f_experiment_1_drug_data_new_run
     parameters:
-      weight: 1.0
+      weight: 0.2
   - layer_range: [0, 12]
     model: mllm-dev/gpt2_f_experiment_2_drug_data_new_run
     parameters:
-      weight: 1.0
+      weight: 0.2
   - layer_range: [0, 12]
     model: mllm-dev/gpt2_f_experiment_3_drug_data_new_run
     parameters:
-      weight: 1.0
+      weight: 0.2
   - layer_range: [0, 12]
     model: mllm-dev/gpt2_f_experiment_4_drug_data_new_run
     parameters:
-      weight: 1.0
+      weight: 0.2
 ```
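For context on the method change above: a `dare_linear` merge drops a random fraction of each fine-tuned model's parameter delta against the base model, rescales the surviving entries to keep the delta unbiased, and then combines the deltas with the per-model weights. The following NumPy sketch illustrates that idea only; the function name, toy tensors, and drop rate are illustrative and not mergekit's actual implementation:

```python
import numpy as np

def dare_linear_merge(base, finetuned, weights, drop_rate=0.9, seed=0):
    """Toy sketch of a DARE-style linear merge (arXiv:2311.03099):
    randomly drop a fraction of each model's delta from the base,
    rescale survivors by 1/(1 - drop_rate), then linearly combine."""
    rng = np.random.default_rng(seed)
    merged_delta = np.zeros_like(base)
    for model, weight in zip(finetuned, weights):
        delta = model - base                          # task vector
        mask = rng.random(delta.shape) >= drop_rate   # keep a random subset
        rescaled = delta * mask / (1.0 - drop_rate)   # unbiased rescale
        merged_delta += weight * rescaled
    return base + merged_delta

# Five toy "fine-tuned" models merged with the config's 0.2 weights.
base = np.zeros(4)
models = [np.ones(4) * (i + 1) for i in range(5)]
merged = dare_linear_merge(base, models, weights=[0.2] * 5, drop_rate=0.5)
```

With `drop_rate=0.0` no entries are dropped and the result reduces to a plain weighted linear average, which is a useful sanity check on the sketch.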
config.json CHANGED
@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "mllm-dev/gpt2_f_experiment_2_drug_data_new_run",
+  "_name_or_path": "mllm-dev/gpt2_f_experiment_0_drug_data_new_run",
   "activation_function": "gelu_new",
   "architectures": [
     "GPT2ForSequenceClassification"
mergekit_config.yml CHANGED
@@ -1,24 +1,27 @@
+base_model: mllm-dev/gpt2_f_experiment_0_drug_data_new_run
 dtype: float16
-merge_method: linear
+merge_method: dare_linear
+parameters:
+  normalize: 1.0
 slices:
 - sources:
   - layer_range: [0, 12]
     model: mllm-dev/gpt2_f_experiment_0_drug_data_new_run
     parameters:
-      weight: 1.0
+      weight: 0.2
   - layer_range: [0, 12]
     model: mllm-dev/gpt2_f_experiment_1_drug_data_new_run
     parameters:
-      weight: 1.0
+      weight: 0.2
   - layer_range: [0, 12]
     model: mllm-dev/gpt2_f_experiment_2_drug_data_new_run
     parameters:
-      weight: 1.0
+      weight: 0.2
   - layer_range: [0, 12]
     model: mllm-dev/gpt2_f_experiment_3_drug_data_new_run
     parameters:
-      weight: 1.0
+      weight: 0.2
   - layer_range: [0, 12]
     model: mllm-dev/gpt2_f_experiment_4_drug_data_new_run
     parameters:
-      weight: 1.0
+      weight: 0.2
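The new config also sets `normalize: 1.0`, which (as I understand mergekit's behavior; treat this as an assumption) rescales the per-model weights so they sum to one before merging. With five equal weights of 0.2 the list is already normalized, but the option guards against totals drifting from 1. A minimal standalone sketch of that step, not mergekit's code:

```python
def normalize_weights(weights):
    """Scale merge weights so they sum to 1, as mergekit's
    `normalize` option is assumed to do before combining models."""
    total = sum(weights)
    return [w / total for w in weights]

# Five equal 0.2 weights already sum to 1, so they pass through unchanged.
print(normalize_weights([0.2] * 5))
# Uneven weights get rescaled to fractions of the total.
print(normalize_weights([1.0, 3.0]))
```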
model-00001-of-00001.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:45d2c76b0cd7575720adb18f45b7b7d18ba048a4e78c89a7382364eb5c92c58b
+oid sha256:66898369bbead1e332e534f5a64b4d3f23601a74d6384ee152e68560e680fc91
 size 248909944