lejelly committed on
Commit 27f872c · verified · 1 parent: 3f485d6

Upload folder using huggingface_hub

README.md ADDED
@@ -0,0 +1,61 @@
---
tags:
- merge
- parameter_wise
- llm-adamerge
base_model: mistralai/Mistral-7B-v0.1
---

# Merged Model using LLM-AdaMerge (parameter_wise)

This model was created by merging multiple fine-tuned models using the LLM-AdaMerge approach with parameter-wise merging.

## Merge Details

- **Merge Type**: parameter_wise
- **Base Model**: mistralai/Mistral-7B-v0.1
- **Number of Models Merged**: 3
- **Models Merged**: instruct, math, code
- **Final Training Loss**: N/A
- **Training Epochs**: 0

## Lambda Coefficients

The following lambda coefficients were learned during training:

### Parameter-wise Lambdas

This model uses parameter-wise lambda coefficients. Total parameters with individual lambdas: 291

See the uploaded `learned_lambdas.json` file for detailed parameter-wise coefficients.
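Conceptually, parameter-wise merging of this kind adds each model's task vector (fine-tuned weights minus base weights) to the base, scaled by a lambda learned per parameter tensor. A minimal sketch of that arithmetic, using plain Python numbers and hypothetical names (this is not the repository's actual training code):

```python
def merge_parameter_wise(base, task_models, lambdas):
    """Merge task vectors into the base weights.

    base: dict mapping parameter name -> value
    task_models: list of dicts with the same keys as `base`
    lambdas: dict mapping parameter name -> one coefficient per task model
    (names and structure are illustrative assumptions)
    """
    merged = {}
    for name, base_w in base.items():
        delta = 0.0
        for lam, task in zip(lambdas[name], task_models):
            # task vector = fine-tuned weight minus base weight
            delta += lam * (task[name] - base_w)
        merged[name] = base_w + delta
    return merged
```

In the real model the same formula is applied tensor-wise to all 291 parameter tensors, with the three coefficients per tensor coming from `learned_lambdas.json`.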

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("your-username/model-name")
tokenizer = AutoTokenizer.from_pretrained("your-username/model-name")

# Use the model
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0]))
```

## Training Configuration

See the uploaded `training_config.json` file for detailed training configuration.

## Citation

If you use this model, please cite the LLM-AdaMerge paper:

```bibtex
@article{llmadamerge2024,
  title={LLM-AdaMerge: Adaptive Model Merging for Large Language Models},
  author={...},
  year={2024}
}
```
config.json ADDED
@@ -0,0 +1,26 @@
{
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "head_dim": null,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 32768,
  "model_type": "mistral",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-05,
  "rope_theta": 10000.0,
  "sliding_window": 4096,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.52.4",
  "use_cache": true,
  "vocab_size": 32000
}
generation_config.json ADDED
@@ -0,0 +1,6 @@
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "transformers_version": "4.52.4"
}
learned_lambdas.json ADDED
@@ -0,0 +1,1759 @@
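The file below stores one `[instruct, math, code]` lambda triplet per parameter tensor. A small helper for sanity-checking such a payload (the function name and the inline sample are illustrative, not part of the repository):

```python
import json

def summarize_lambdas(raw_json):
    """Return (number of parameter tensors, per-tensor mean lambda)
    for a payload shaped like {"lambdas": [[l_instruct, l_math, l_code], ...]}."""
    data = json.loads(raw_json)
    triplets = data["lambdas"]
    means = [sum(t) / len(t) for t in triplets]
    return len(triplets), means

# Example with the first triplet from this commit:
sample = '{"lambdas": [[0.162, 0.250, 0.386]]}'
n, means = summarize_lambdas(sample)
```

Run against the full file, `n` should come out as 291, matching the count stated in the README.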
{
  "lambdas": [
    [
      0.16208726167678833,
      0.24977269768714905,
      0.38571128249168396
    ],
    [
      0.18331214785575867,
      0.14169958233833313,
      0.16920897364616394
    ],
    ...
  ]
}
(remaining per-parameter [instruct, math, code] lambda triplets truncated in this view; the full 1,759-line file is part of this commit)
1422
+ ],
1423
+ [
1424
+ 0.35379788279533386,
1425
+ 0.31738343834877014,
1426
+ 0.5127444267272949
1427
+ ],
1428
+ [
1429
+ 0.25742167234420776,
1430
+ 0.05766354128718376,
1431
+ 0.47457870841026306
1432
+ ],
1433
+ [
1434
+ 0.1496676504611969,
1435
+ 0.489128440618515,
1436
+ 0.5340008735656738
1437
+ ],
1438
+ [
1439
+ 0.27022284269332886,
1440
+ 0.2991064488887787,
1441
+ 0.08482208102941513
1442
+ ],
1443
+ [
1444
+ 0.33586573600769043,
1445
+ 0.2991064488887787,
1446
+ 0.08091895282268524
1447
+ ],
1448
+ [
1449
+ 0.5533952713012695,
1450
+ 0.5185452699661255,
1451
+ 0.05543660745024681
1452
+ ],
1453
+ [
1454
+ -0.04444197565317154,
1455
+ 0.673144519329071,
1456
+ 0.009347585029900074
1457
+ ]
1458
+ ],
+ "model_names": ["instruct", "math", "code"],
+ "num_models": 3,
+ "num_params": 291,
+ "param_names": [
+ "model.embed_tokens.weight",
+ "model.layers.0.self_attn.q_proj.weight",
+ "model.layers.0.self_attn.k_proj.weight",
+ "model.layers.0.self_attn.v_proj.weight",
+ "model.layers.0.self_attn.o_proj.weight",
+ "model.layers.0.mlp.gate_proj.weight",
+ "model.layers.0.mlp.up_proj.weight",
+ "model.layers.0.mlp.down_proj.weight",
+ "model.layers.0.input_layernorm.weight",
+ "model.layers.0.post_attention_layernorm.weight",
+ "model.layers.1.self_attn.q_proj.weight",
+ "model.layers.1.self_attn.k_proj.weight",
+ "model.layers.1.self_attn.v_proj.weight",
+ "model.layers.1.self_attn.o_proj.weight",
+ "model.layers.1.mlp.gate_proj.weight",
+ "model.layers.1.mlp.up_proj.weight",
+ "model.layers.1.mlp.down_proj.weight",
+ "model.layers.1.input_layernorm.weight",
+ "model.layers.1.post_attention_layernorm.weight",
+ "model.layers.2.self_attn.q_proj.weight",
+ "model.layers.2.self_attn.k_proj.weight",
+ "model.layers.2.self_attn.v_proj.weight",
+ "model.layers.2.self_attn.o_proj.weight",
+ "model.layers.2.mlp.gate_proj.weight",
+ "model.layers.2.mlp.up_proj.weight",
+ "model.layers.2.mlp.down_proj.weight",
+ "model.layers.2.input_layernorm.weight",
+ "model.layers.2.post_attention_layernorm.weight",
+ "model.layers.3.self_attn.q_proj.weight",
+ "model.layers.3.self_attn.k_proj.weight",
+ "model.layers.3.self_attn.v_proj.weight",
+ "model.layers.3.self_attn.o_proj.weight",
+ "model.layers.3.mlp.gate_proj.weight",
+ "model.layers.3.mlp.up_proj.weight",
+ "model.layers.3.mlp.down_proj.weight",
+ "model.layers.3.input_layernorm.weight",
+ "model.layers.3.post_attention_layernorm.weight",
+ "model.layers.4.self_attn.q_proj.weight",
+ "model.layers.4.self_attn.k_proj.weight",
+ "model.layers.4.self_attn.v_proj.weight",
+ "model.layers.4.self_attn.o_proj.weight",
+ "model.layers.4.mlp.gate_proj.weight",
+ "model.layers.4.mlp.up_proj.weight",
+ "model.layers.4.mlp.down_proj.weight",
+ "model.layers.4.input_layernorm.weight",
+ "model.layers.4.post_attention_layernorm.weight",
+ "model.layers.5.self_attn.q_proj.weight",
+ "model.layers.5.self_attn.k_proj.weight",
+ "model.layers.5.self_attn.v_proj.weight",
+ "model.layers.5.self_attn.o_proj.weight",
+ "model.layers.5.mlp.gate_proj.weight",
+ "model.layers.5.mlp.up_proj.weight",
+ "model.layers.5.mlp.down_proj.weight",
+ "model.layers.5.input_layernorm.weight",
+ "model.layers.5.post_attention_layernorm.weight",
+ "model.layers.6.self_attn.q_proj.weight",
+ "model.layers.6.self_attn.k_proj.weight",
+ "model.layers.6.self_attn.v_proj.weight",
+ "model.layers.6.self_attn.o_proj.weight",
+ "model.layers.6.mlp.gate_proj.weight",
+ "model.layers.6.mlp.up_proj.weight",
+ "model.layers.6.mlp.down_proj.weight",
+ "model.layers.6.input_layernorm.weight",
+ "model.layers.6.post_attention_layernorm.weight",
+ "model.layers.7.self_attn.q_proj.weight",
+ "model.layers.7.self_attn.k_proj.weight",
+ "model.layers.7.self_attn.v_proj.weight",
+ "model.layers.7.self_attn.o_proj.weight",
+ "model.layers.7.mlp.gate_proj.weight",
+ "model.layers.7.mlp.up_proj.weight",
+ "model.layers.7.mlp.down_proj.weight",
+ "model.layers.7.input_layernorm.weight",
+ "model.layers.7.post_attention_layernorm.weight",
+ "model.layers.8.self_attn.q_proj.weight",
+ "model.layers.8.self_attn.k_proj.weight",
+ "model.layers.8.self_attn.v_proj.weight",
+ "model.layers.8.self_attn.o_proj.weight",
+ "model.layers.8.mlp.gate_proj.weight",
+ "model.layers.8.mlp.up_proj.weight",
+ "model.layers.8.mlp.down_proj.weight",
+ "model.layers.8.input_layernorm.weight",
+ "model.layers.8.post_attention_layernorm.weight",
+ "model.layers.9.self_attn.q_proj.weight",
+ "model.layers.9.self_attn.k_proj.weight",
+ "model.layers.9.self_attn.v_proj.weight",
+ "model.layers.9.self_attn.o_proj.weight",
+ "model.layers.9.mlp.gate_proj.weight",
+ "model.layers.9.mlp.up_proj.weight",
+ "model.layers.9.mlp.down_proj.weight",
+ "model.layers.9.input_layernorm.weight",
+ "model.layers.9.post_attention_layernorm.weight",
+ "model.layers.10.self_attn.q_proj.weight",
+ "model.layers.10.self_attn.k_proj.weight",
+ "model.layers.10.self_attn.v_proj.weight",
+ "model.layers.10.self_attn.o_proj.weight",
+ "model.layers.10.mlp.gate_proj.weight",
+ "model.layers.10.mlp.up_proj.weight",
+ "model.layers.10.mlp.down_proj.weight",
+ "model.layers.10.input_layernorm.weight",
+ "model.layers.10.post_attention_layernorm.weight",
+ "model.layers.11.self_attn.q_proj.weight",
+ "model.layers.11.self_attn.k_proj.weight",
+ "model.layers.11.self_attn.v_proj.weight",
+ "model.layers.11.self_attn.o_proj.weight",
+ "model.layers.11.mlp.gate_proj.weight",
+ "model.layers.11.mlp.up_proj.weight",
+ "model.layers.11.mlp.down_proj.weight",
+ "model.layers.11.input_layernorm.weight",
+ "model.layers.11.post_attention_layernorm.weight",
+ "model.layers.12.self_attn.q_proj.weight",
+ "model.layers.12.self_attn.k_proj.weight",
+ "model.layers.12.self_attn.v_proj.weight",
+ "model.layers.12.self_attn.o_proj.weight",
+ "model.layers.12.mlp.gate_proj.weight",
+ "model.layers.12.mlp.up_proj.weight",
+ "model.layers.12.mlp.down_proj.weight",
+ "model.layers.12.input_layernorm.weight",
+ "model.layers.12.post_attention_layernorm.weight",
+ "model.layers.13.self_attn.q_proj.weight",
+ "model.layers.13.self_attn.k_proj.weight",
+ "model.layers.13.self_attn.v_proj.weight",
+ "model.layers.13.self_attn.o_proj.weight",
+ "model.layers.13.mlp.gate_proj.weight",
+ "model.layers.13.mlp.up_proj.weight",
+ "model.layers.13.mlp.down_proj.weight",
+ "model.layers.13.input_layernorm.weight",
+ "model.layers.13.post_attention_layernorm.weight",
+ "model.layers.14.self_attn.q_proj.weight",
+ "model.layers.14.self_attn.k_proj.weight",
+ "model.layers.14.self_attn.v_proj.weight",
+ "model.layers.14.self_attn.o_proj.weight",
+ "model.layers.14.mlp.gate_proj.weight",
+ "model.layers.14.mlp.up_proj.weight",
+ "model.layers.14.mlp.down_proj.weight",
+ "model.layers.14.input_layernorm.weight",
+ "model.layers.14.post_attention_layernorm.weight",
+ "model.layers.15.self_attn.q_proj.weight",
+ "model.layers.15.self_attn.k_proj.weight",
+ "model.layers.15.self_attn.v_proj.weight",
+ "model.layers.15.self_attn.o_proj.weight",
+ "model.layers.15.mlp.gate_proj.weight",
+ "model.layers.15.mlp.up_proj.weight",
+ "model.layers.15.mlp.down_proj.weight",
+ "model.layers.15.input_layernorm.weight",
+ "model.layers.15.post_attention_layernorm.weight",
+ "model.layers.16.self_attn.q_proj.weight",
+ "model.layers.16.self_attn.k_proj.weight",
+ "model.layers.16.self_attn.v_proj.weight",
+ "model.layers.16.self_attn.o_proj.weight",
+ "model.layers.16.mlp.gate_proj.weight",
+ "model.layers.16.mlp.up_proj.weight",
+ "model.layers.16.mlp.down_proj.weight",
+ "model.layers.16.input_layernorm.weight",
+ "model.layers.16.post_attention_layernorm.weight",
+ "model.layers.17.self_attn.q_proj.weight",
+ "model.layers.17.self_attn.k_proj.weight",
+ "model.layers.17.self_attn.v_proj.weight",
+ "model.layers.17.self_attn.o_proj.weight",
+ "model.layers.17.mlp.gate_proj.weight",
+ "model.layers.17.mlp.up_proj.weight",
+ "model.layers.17.mlp.down_proj.weight",
+ "model.layers.17.input_layernorm.weight",
+ "model.layers.17.post_attention_layernorm.weight",
+ "model.layers.18.self_attn.q_proj.weight",
+ "model.layers.18.self_attn.k_proj.weight",
+ "model.layers.18.self_attn.v_proj.weight",
+ "model.layers.18.self_attn.o_proj.weight",
+ "model.layers.18.mlp.gate_proj.weight",
+ "model.layers.18.mlp.up_proj.weight",
+ "model.layers.18.mlp.down_proj.weight",
+ "model.layers.18.input_layernorm.weight",
+ "model.layers.18.post_attention_layernorm.weight",
+ "model.layers.19.self_attn.q_proj.weight",
+ "model.layers.19.self_attn.k_proj.weight",
+ "model.layers.19.self_attn.v_proj.weight",
+ "model.layers.19.self_attn.o_proj.weight",
+ "model.layers.19.mlp.gate_proj.weight",
+ "model.layers.19.mlp.up_proj.weight",
+ "model.layers.19.mlp.down_proj.weight",
+ "model.layers.19.input_layernorm.weight",
+ "model.layers.19.post_attention_layernorm.weight",
+ "model.layers.20.self_attn.q_proj.weight",
+ "model.layers.20.self_attn.k_proj.weight",
+ "model.layers.20.self_attn.v_proj.weight",
+ "model.layers.20.self_attn.o_proj.weight",
+ "model.layers.20.mlp.gate_proj.weight",
+ "model.layers.20.mlp.up_proj.weight",
+ "model.layers.20.mlp.down_proj.weight",
+ "model.layers.20.input_layernorm.weight",
+ "model.layers.20.post_attention_layernorm.weight",
+ "model.layers.21.self_attn.q_proj.weight",
+ "model.layers.21.self_attn.k_proj.weight",
+ "model.layers.21.self_attn.v_proj.weight",
+ "model.layers.21.self_attn.o_proj.weight",
+ "model.layers.21.mlp.gate_proj.weight",
+ "model.layers.21.mlp.up_proj.weight",
+ "model.layers.21.mlp.down_proj.weight",
+ "model.layers.21.input_layernorm.weight",
+ "model.layers.21.post_attention_layernorm.weight",
+ "model.layers.22.self_attn.q_proj.weight",
+ "model.layers.22.self_attn.k_proj.weight",
+ "model.layers.22.self_attn.v_proj.weight",
+ "model.layers.22.self_attn.o_proj.weight",
+ "model.layers.22.mlp.gate_proj.weight",
+ "model.layers.22.mlp.up_proj.weight",
+ "model.layers.22.mlp.down_proj.weight",
+ "model.layers.22.input_layernorm.weight",
+ "model.layers.22.post_attention_layernorm.weight",
+ "model.layers.23.self_attn.q_proj.weight",
+ "model.layers.23.self_attn.k_proj.weight",
+ "model.layers.23.self_attn.v_proj.weight",
+ "model.layers.23.self_attn.o_proj.weight",
+ "model.layers.23.mlp.gate_proj.weight",
+ "model.layers.23.mlp.up_proj.weight",
+ "model.layers.23.mlp.down_proj.weight",
+ "model.layers.23.input_layernorm.weight",
+ "model.layers.23.post_attention_layernorm.weight",
+ "model.layers.24.self_attn.q_proj.weight",
+ "model.layers.24.self_attn.k_proj.weight",
+ "model.layers.24.self_attn.v_proj.weight",
+ "model.layers.24.self_attn.o_proj.weight",
+ "model.layers.24.mlp.gate_proj.weight",
+ "model.layers.24.mlp.up_proj.weight",
+ "model.layers.24.mlp.down_proj.weight",
+ "model.layers.24.input_layernorm.weight",
+ "model.layers.24.post_attention_layernorm.weight",
+ "model.layers.25.self_attn.q_proj.weight",
+ "model.layers.25.self_attn.k_proj.weight",
+ "model.layers.25.self_attn.v_proj.weight",
+ "model.layers.25.self_attn.o_proj.weight",
+ "model.layers.25.mlp.gate_proj.weight",
+ "model.layers.25.mlp.up_proj.weight",
+ "model.layers.25.mlp.down_proj.weight",
+ "model.layers.25.input_layernorm.weight",
+ "model.layers.25.post_attention_layernorm.weight",
+ "model.layers.26.self_attn.q_proj.weight",
+ "model.layers.26.self_attn.k_proj.weight",
+ "model.layers.26.self_attn.v_proj.weight",
+ "model.layers.26.self_attn.o_proj.weight",
+ "model.layers.26.mlp.gate_proj.weight",
+ "model.layers.26.mlp.up_proj.weight",
+ "model.layers.26.mlp.down_proj.weight",
+ "model.layers.26.input_layernorm.weight",
+ "model.layers.26.post_attention_layernorm.weight",
+ "model.layers.27.self_attn.q_proj.weight",
+ "model.layers.27.self_attn.k_proj.weight",
+ "model.layers.27.self_attn.v_proj.weight",
+ "model.layers.27.self_attn.o_proj.weight",
+ "model.layers.27.mlp.gate_proj.weight",
+ "model.layers.27.mlp.up_proj.weight",
+ "model.layers.27.mlp.down_proj.weight",
+ "model.layers.27.input_layernorm.weight",
+ "model.layers.27.post_attention_layernorm.weight",
+ "model.layers.28.self_attn.q_proj.weight",
+ "model.layers.28.self_attn.k_proj.weight",
+ "model.layers.28.self_attn.v_proj.weight",
+ "model.layers.28.self_attn.o_proj.weight",
+ "model.layers.28.mlp.gate_proj.weight",
+ "model.layers.28.mlp.up_proj.weight",
+ "model.layers.28.mlp.down_proj.weight",
+ "model.layers.28.input_layernorm.weight",
+ "model.layers.28.post_attention_layernorm.weight",
+ "model.layers.29.self_attn.q_proj.weight",
+ "model.layers.29.self_attn.k_proj.weight",
+ "model.layers.29.self_attn.v_proj.weight",
+ "model.layers.29.self_attn.o_proj.weight",
+ "model.layers.29.mlp.gate_proj.weight",
+ "model.layers.29.mlp.up_proj.weight",
+ "model.layers.29.mlp.down_proj.weight",
+ "model.layers.29.input_layernorm.weight",
+ "model.layers.29.post_attention_layernorm.weight",
+ "model.layers.30.self_attn.q_proj.weight",
+ "model.layers.30.self_attn.k_proj.weight",
+ "model.layers.30.self_attn.v_proj.weight",
+ "model.layers.30.self_attn.o_proj.weight",
+ "model.layers.30.mlp.gate_proj.weight",
+ "model.layers.30.mlp.up_proj.weight",
+ "model.layers.30.mlp.down_proj.weight",
+ "model.layers.30.input_layernorm.weight",
+ "model.layers.30.post_attention_layernorm.weight",
+ "model.layers.31.self_attn.q_proj.weight",
+ "model.layers.31.self_attn.k_proj.weight",
+ "model.layers.31.self_attn.v_proj.weight",
+ "model.layers.31.self_attn.o_proj.weight",
+ "model.layers.31.mlp.gate_proj.weight",
+ "model.layers.31.mlp.up_proj.weight",
+ "model.layers.31.mlp.down_proj.weight",
+ "model.layers.31.input_layernorm.weight",
+ "model.layers.31.post_attention_layernorm.weight",
+ "model.norm.weight",
+ "lm_head.weight"
+ ]
+ }
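
The `learned_lambdas.json` file above stores one row of the `lambdas` matrix per entry in `param_names` (291 rows) and one column per entry in `model_names`. As a rough illustration of how such per-parameter coefficients combine task vectors, here is a hedged sketch; the function name and state-dict plumbing are assumptions for illustration, not the repository's actual code:

```python
import json

import torch


def merge_with_parameter_wise_lambdas(base_state, finetuned_states, lambdas_path):
    """Illustrative parameter-wise merge: for each parameter p,
    merged[p] = base[p] + sum_k lambda[p][k] * (finetuned_k[p] - base[p]).

    base_state: state_dict of the base model.
    finetuned_states: dict keyed by model name (e.g. "instruct", "math",
    "code") mapping to each fine-tuned model's state_dict.
    lambdas_path: path to a JSON file with the same fields as
    learned_lambdas.json above ("lambdas", "model_names", "param_names").
    """
    with open(lambdas_path) as f:
        cfg = json.load(f)
    merged = {}
    for p_idx, name in enumerate(cfg["param_names"]):
        merged_param = base_state[name].clone()
        for m_idx, model_name in enumerate(cfg["model_names"]):
            lam = cfg["lambdas"][p_idx][m_idx]
            # Task vector: difference between fine-tuned and base weights.
            task_vector = finetuned_states[model_name][name] - base_state[name]
            merged_param += lam * task_vector
        merged[name] = merged_param
    return merged
```

In practice the state dicts would come from `AutoModelForCausalLM.from_pretrained(...).state_dict()`; the sketch only shows the arithmetic that the lambda matrix parameterizes.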
logs/save_merged_model_20250617_080653.log ADDED
@@ -0,0 +1,74 @@
+ 2025-06-17 08:06:53 - experiment_save_merged_model - INFO - Starting merged model save process
+ 2025-06-17 08:06:53 - experiment_save_merged_model - INFO - Arguments: {'lambdas_path': '/work/gj26/b20042/LLM-AdaMerge/outputs/mistral-7b/parameter-wise/test-size-dataset/shannon-entropy-loss/llm_adamerge_parameterwise_lambdas.json', 'model_config': '/work/gj26/b20042/LLM-AdaMerge/src/configs/model_config.yaml', 'output_dir': '/work/gj26/b20042/LLM-AdaMerge/mergekit/outputs/mistral-7b/llmadamerge/test-size-dataset/parameter-wise-shannonentropy', 'model_name': 'merged-model', 'push_to_hub': False, 'hub_repo_id': 'lejelly/test-size-dataset-parameter-wise-llm-adamerge-shannonentropy-mistral-7b-instrcut-math-code', 'private': False, 'device': 'cuda', 'debug': False}
+ 2025-06-17 08:06:53 - experiment_save_merged_model - INFO - Loading lambdas from /work/gj26/b20042/LLM-AdaMerge/outputs/mistral-7b/parameter-wise/test-size-dataset/shannon-entropy-loss/llm_adamerge_parameterwise_lambdas.json
+ 2025-06-17 08:06:53 - experiment_save_merged_model - INFO - Auto-detected parameter-wise merge from JSON structure
+ 2025-06-17 08:06:53 - experiment_save_merged_model - INFO - Merge type: parameter_wise
+ 2025-06-17 08:06:53 - experiment_save_merged_model - INFO - [Initial] Memory Usage:
+ 2025-06-17 08:06:53 - experiment_save_merged_model - INFO - Process: 0.40 GB (0.2%)
+ 2025-06-17 08:06:53 - experiment_save_merged_model - INFO - System: 9.47 GB / 212.52 GB (9.1%)
+ 2025-06-17 08:06:53 - experiment_save_merged_model - INFO - Available: 193.22 GB
+ 2025-06-17 08:06:53 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
+ 2025-06-17 08:06:53 - experiment_save_merged_model - INFO - Loading models
+ 2025-06-17 08:07:12 - experiment_save_merged_model - INFO - [After loading models] Memory Usage:
+ 2025-06-17 08:07:12 - experiment_save_merged_model - INFO - Process: 41.24 GB (19.4%)
+ 2025-06-17 08:07:12 - experiment_save_merged_model - INFO - System: 49.32 GB / 212.52 GB (31.1%)
+ 2025-06-17 08:07:12 - experiment_save_merged_model - INFO - Available: 146.36 GB
+ 2025-06-17 08:07:12 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
+ 2025-06-17 08:07:12 - experiment_save_merged_model - INFO - Initializing parameter_wise AdaMerge
+ 2025-06-17 08:08:26 - experiment_save_merged_model - INFO - Loading learned lambdas
+ 2025-06-17 08:08:26 - experiment_save_merged_model - INFO - Deleting original models to free memory (task vectors already computed)
+ 2025-06-17 08:08:26 - experiment_save_merged_model - INFO - [Before deleting models] Memory Usage:
+ 2025-06-17 08:08:26 - experiment_save_merged_model - INFO - Process: 95.39 GB (44.9%)
+ 2025-06-17 08:08:26 - experiment_save_merged_model - INFO - System: 90.40 GB / 212.52 GB (50.5%)
+ 2025-06-17 08:08:26 - experiment_save_merged_model - INFO - Available: 105.22 GB
+ 2025-06-17 08:08:26 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
+ 2025-06-17 08:08:26 - experiment_save_merged_model - INFO - Clearing model_loader references
+ 2025-06-17 08:08:27 - experiment_save_merged_model - INFO - Deleting model variables
+ 2025-06-17 08:08:27 - experiment_save_merged_model - INFO - Running garbage collection
+ 2025-06-17 08:08:27 - experiment_save_merged_model - INFO - [After deleting models and GC] Memory Usage:
+ 2025-06-17 08:08:27 - experiment_save_merged_model - INFO - Process: 56.07 GB (26.4%)
+ 2025-06-17 08:08:27 - experiment_save_merged_model - INFO - System: 64.68 GB / 212.52 GB (38.4%)
+ 2025-06-17 08:08:27 - experiment_save_merged_model - INFO - Available: 130.94 GB
+ 2025-06-17 08:08:27 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
+ 2025-06-17 08:08:27 - experiment_save_merged_model - INFO - [After loading lambdas] Memory Usage:
+ 2025-06-17 08:08:27 - experiment_save_merged_model - INFO - Process: 56.07 GB (26.4%)
+ 2025-06-17 08:08:27 - experiment_save_merged_model - INFO - System: 64.68 GB / 212.52 GB (38.4%)
+ 2025-06-17 08:08:27 - experiment_save_merged_model - INFO - Available: 130.94 GB
+ 2025-06-17 08:08:27 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
+ 2025-06-17 08:08:27 - experiment_save_merged_model - INFO - Creating merged model with learned lambdas
+ 2025-06-17 08:08:27 - experiment_save_merged_model - INFO - Using merge_models_for_save() for parameter-wise merge
+ 2025-06-17 08:10:20 - experiment_save_merged_model - INFO - [After merging models] Memory Usage:
+ 2025-06-17 08:10:20 - experiment_save_merged_model - INFO - Process: 57.94 GB (27.3%)
+ 2025-06-17 08:10:20 - experiment_save_merged_model - INFO - System: 93.60 GB / 212.52 GB (48.8%)
+ 2025-06-17 08:10:20 - experiment_save_merged_model - INFO - Available: 108.72 GB
+ 2025-06-17 08:10:20 - experiment_save_merged_model - INFO - GPU 0: Allocated: 13.49 GB, Reserved: 27.23 GB, Total: 94.50 GB
+ 2025-06-17 08:10:20 - experiment_save_merged_model - INFO - Freeing memory from AdaMerge object (task vectors and base params no longer needed)
+ 2025-06-17 08:10:20 - experiment_save_merged_model - INFO - Deleting task vectors
+ 2025-06-17 08:10:20 - experiment_save_merged_model - INFO - Deleting base params
+ 2025-06-17 08:10:21 - experiment_save_merged_model - INFO - Deleting functional model
+ 2025-06-17 08:10:21 - experiment_save_merged_model - INFO - [After freeing AdaMerge memory] Memory Usage:
+ 2025-06-17 08:10:21 - experiment_save_merged_model - INFO - Process: 5.95 GB (2.8%)
+ 2025-06-17 08:10:21 - experiment_save_merged_model - INFO - System: 27.73 GB / 212.52 GB (17.9%)
+ 2025-06-17 08:10:21 - experiment_save_merged_model - INFO - Available: 174.59 GB
+ 2025-06-17 08:10:21 - experiment_save_merged_model - INFO - GPU 0: Allocated: 13.49 GB, Reserved: 13.62 GB, Total: 94.50 GB
+ 2025-06-17 08:10:21 - experiment_save_merged_model - INFO - Saving merged model to /work/gj26/b20042/LLM-AdaMerge/mergekit/outputs/mistral-7b/llmadamerge/test-size-dataset/parameter-wise-shannonentropy
+ 2025-06-17 08:10:21 - experiment_save_merged_model - INFO - Moving parameter-wise merged model to CPU for saving
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - Successfully saved 3 safetensors files:
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - - model-00001-of-00003.safetensors (4714.17 MB)
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - - model-00003-of-00003.safetensors (4330.17 MB)
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - - model-00002-of-00003.safetensors (4768.20 MB)
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - [After saving model] Memory Usage:
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - Process: 15.06 GB (7.1%)
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - System: 23.39 GB / 212.52 GB (19.0%)
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - Available: 172.24 GB
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - Saving tokenizer
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - Copied lambdas file to /work/gj26/b20042/LLM-AdaMerge/mergekit/outputs/mistral-7b/llmadamerge/test-size-dataset/parameter-wise-shannonentropy/learned_lambdas.json
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - Creating model card
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - Cleaning up models
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - [After cleanup] Memory Usage:
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - Process: 3.01 GB (1.4%)
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - System: 11.55 GB / 212.52 GB (13.4%)
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - Available: 184.07 GB
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
+ 2025-06-17 08:10:55 - experiment_save_merged_model - INFO - Model saved successfully to /work/gj26/b20042/LLM-AdaMerge/mergekit/outputs/mistral-7b/llmadamerge/test-size-dataset/parameter-wise-shannonentropy
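
The repeated "Memory Usage" checkpoints in this log are a useful pattern when several 7B state dicts must coexist in RAM. A minimal stdlib-only sketch of such a checkpoint logger follows; the actual script evidently also reports system-wide and per-GPU memory (presumably via something like psutil and `torch.cuda.memory_allocated`), which is omitted here:

```python
import gc
import resource


def log_memory_usage(tag):
    """Print process memory at a named checkpoint, in the spirit of the
    "[After loading models] Memory Usage:" blocks in the log above.

    Only the stdlib `resource` module is used, so this reports peak RSS
    rather than the current/system/GPU figures the real script logs.
    """
    gc.collect()  # collect dropped references before measuring
    # ru_maxrss is reported in kilobytes on Linux.
    peak_rss_gb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / (1024 ** 2)
    print(f"[{tag}] Process peak RSS: {peak_rss_gb:.2f} GB")
    return peak_rss_gb
```

Calling it before and after `del model; gc.collect()` (as the log does around "Deleting model variables") makes it easy to confirm that task vectors really freed the original models.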
model-00001-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dce46bf215b1ba630feb35da26f30a3e6212a95fa4532adb54b6122589b9a061
+ size 4943162240
model-00002-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a18a1022fe48d72d154e2fde66ebbfff991e7edb97a175f7ad43832141c2382e
+ size 4999819232
model-00003-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a2a6c2e2cf19b60f2739e0c0923ec03eaf46dd4de6707bf4a91dbabf6a3be8ec
+ size 4540516256
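
The three `.safetensors` entries above are Git LFS pointer files, not the weights themselves: each records only the blob's `sha256` oid and its size in bytes. A hedged sketch of verifying a downloaded shard against its pointer (the helper names are illustrative, not part of this repository):

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file (version / oid / size lines, like the
    safetensors entries above) into its sha256 hex digest and byte size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {"oid": fields["oid"].split(":", 1)[1], "size": int(fields["size"])}


def verify_shard(path, pointer_text, chunk_size=1 << 20):
    """Return True iff the file at `path` matches the pointer's sha256 and size."""
    meta = parse_lfs_pointer(pointer_text)
    digest = hashlib.sha256()
    total = 0
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so multi-GB shards don't need to fit in memory.
        while chunk := f.read(chunk_size):
            digest.update(chunk)
            total += len(chunk)
    return digest.hexdigest() == meta["oid"] and total == meta["size"]
```

This kind of check is handy after a partial or interrupted download, before spending time loading a 4+ GB shard into a model.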
model.safetensors.index.json ADDED
@@ -0,0 +1,298 @@
+ {
+ "metadata": {
+ "total_size": 14483464192
+ },
+ "weight_map": {
+ "lm_head.weight": "model-00003-of-00003.safetensors",
+ "model.embed_tokens.weight": "model-00001-of-00003.safetensors",
+ "model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors",
+ "model.layers.0.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.0.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+ "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.1.input_layernorm.weight": "model-00001-of-00003.safetensors",
+ "model.layers.1.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.1.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+ "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.10.input_layernorm.weight": "model-00002-of-00003.safetensors",
+ "model.layers.10.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.10.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.10.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.10.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+ "model.layers.10.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.10.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.10.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.10.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "model.layers.11.input_layernorm.weight": "model-00002-of-00003.safetensors",
+ "model.layers.11.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.11.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.11.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.11.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+ "model.layers.11.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.11.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.11.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.11.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.12.input_layernorm.weight": "model-00002-of-00003.safetensors",
+ "model.layers.12.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.12.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+ "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.13.input_layernorm.weight": "model-00002-of-00003.safetensors",
+ "model.layers.13.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.13.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+ "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.13.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.14.input_layernorm.weight": "model-00002-of-00003.safetensors",
+ "model.layers.14.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.14.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.14.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+ "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.14.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.14.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.15.input_layernorm.weight": "model-00002-of-00003.safetensors",
+ "model.layers.15.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.15.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.15.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.15.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+ "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.15.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.15.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.15.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.16.input_layernorm.weight": "model-00002-of-00003.safetensors",
+ "model.layers.16.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.16.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.16.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.16.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+ "model.layers.16.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+ "model.layers.16.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
87
+ "model.layers.16.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
88
+ "model.layers.16.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
89
+ "model.layers.17.input_layernorm.weight": "model-00002-of-00003.safetensors",
90
+ "model.layers.17.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
91
+ "model.layers.17.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
92
+ "model.layers.17.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
93
+ "model.layers.17.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
94
+ "model.layers.17.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
95
+ "model.layers.17.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
96
+ "model.layers.17.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
97
+ "model.layers.17.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
98
+ "model.layers.18.input_layernorm.weight": "model-00002-of-00003.safetensors",
99
+ "model.layers.18.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
100
+ "model.layers.18.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
101
+ "model.layers.18.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
102
+ "model.layers.18.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
103
+ "model.layers.18.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
104
+ "model.layers.18.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
105
+ "model.layers.18.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
106
+ "model.layers.18.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
107
+ "model.layers.19.input_layernorm.weight": "model-00002-of-00003.safetensors",
108
+ "model.layers.19.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
109
+ "model.layers.19.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
110
+ "model.layers.19.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
111
+ "model.layers.19.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
112
+ "model.layers.19.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
113
+ "model.layers.19.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
114
+ "model.layers.19.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
115
+ "model.layers.19.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
116
+ "model.layers.2.input_layernorm.weight": "model-00001-of-00003.safetensors",
117
+ "model.layers.2.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
118
+ "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
119
+ "model.layers.2.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
120
+ "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
121
+ "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
122
+ "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
123
+ "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
124
+ "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
125
+ "model.layers.20.input_layernorm.weight": "model-00002-of-00003.safetensors",
126
+ "model.layers.20.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
127
+ "model.layers.20.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
128
+ "model.layers.20.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
129
+ "model.layers.20.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
130
+ "model.layers.20.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
131
+ "model.layers.20.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
132
+ "model.layers.20.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
133
+ "model.layers.20.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
134
+ "model.layers.21.input_layernorm.weight": "model-00002-of-00003.safetensors",
135
+ "model.layers.21.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
136
+ "model.layers.21.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
137
+ "model.layers.21.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
138
+ "model.layers.21.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
139
+ "model.layers.21.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
140
+ "model.layers.21.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
141
+ "model.layers.21.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
142
+ "model.layers.21.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
143
+ "model.layers.22.input_layernorm.weight": "model-00003-of-00003.safetensors",
144
+ "model.layers.22.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
145
+ "model.layers.22.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
146
+ "model.layers.22.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
147
+ "model.layers.22.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
148
+ "model.layers.22.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
149
+ "model.layers.22.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
150
+ "model.layers.22.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
151
+ "model.layers.22.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
152
+ "model.layers.23.input_layernorm.weight": "model-00003-of-00003.safetensors",
153
+ "model.layers.23.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
154
+ "model.layers.23.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
155
+ "model.layers.23.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
156
+ "model.layers.23.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
157
+ "model.layers.23.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
158
+ "model.layers.23.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
159
+ "model.layers.23.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
160
+ "model.layers.23.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
161
+ "model.layers.24.input_layernorm.weight": "model-00003-of-00003.safetensors",
162
+ "model.layers.24.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
163
+ "model.layers.24.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
164
+ "model.layers.24.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
165
+ "model.layers.24.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
166
+ "model.layers.24.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
167
+ "model.layers.24.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
168
+ "model.layers.24.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
169
+ "model.layers.24.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
170
+ "model.layers.25.input_layernorm.weight": "model-00003-of-00003.safetensors",
171
+ "model.layers.25.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
172
+ "model.layers.25.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
173
+ "model.layers.25.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
174
+ "model.layers.25.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
175
+ "model.layers.25.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
176
+ "model.layers.25.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
177
+ "model.layers.25.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
178
+ "model.layers.25.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
179
+ "model.layers.26.input_layernorm.weight": "model-00003-of-00003.safetensors",
180
+ "model.layers.26.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
181
+ "model.layers.26.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
182
+ "model.layers.26.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
183
+ "model.layers.26.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
184
+ "model.layers.26.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
185
+ "model.layers.26.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
186
+ "model.layers.26.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
187
+ "model.layers.26.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
188
+ "model.layers.27.input_layernorm.weight": "model-00003-of-00003.safetensors",
189
+ "model.layers.27.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
190
+ "model.layers.27.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
191
+ "model.layers.27.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
192
+ "model.layers.27.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
193
+ "model.layers.27.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
194
+ "model.layers.27.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
195
+ "model.layers.27.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
196
+ "model.layers.27.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
197
+ "model.layers.28.input_layernorm.weight": "model-00003-of-00003.safetensors",
198
+ "model.layers.28.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
199
+ "model.layers.28.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
200
+ "model.layers.28.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
201
+ "model.layers.28.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
202
+ "model.layers.28.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
203
+ "model.layers.28.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
204
+ "model.layers.28.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
205
+ "model.layers.28.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
206
+ "model.layers.29.input_layernorm.weight": "model-00003-of-00003.safetensors",
207
+ "model.layers.29.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
208
+ "model.layers.29.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
209
+ "model.layers.29.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
210
+ "model.layers.29.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
211
+ "model.layers.29.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
212
+ "model.layers.29.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
213
+ "model.layers.29.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
214
+ "model.layers.29.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
215
+ "model.layers.3.input_layernorm.weight": "model-00001-of-00003.safetensors",
216
+ "model.layers.3.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
217
+ "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
218
+ "model.layers.3.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
219
+ "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
220
+ "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
221
+ "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
222
+ "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
223
+ "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
224
+ "model.layers.30.input_layernorm.weight": "model-00003-of-00003.safetensors",
225
+ "model.layers.30.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
226
+ "model.layers.30.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
227
+ "model.layers.30.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
228
+ "model.layers.30.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
229
+ "model.layers.30.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
230
+ "model.layers.30.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
231
+ "model.layers.30.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
232
+ "model.layers.30.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
233
+ "model.layers.31.input_layernorm.weight": "model-00003-of-00003.safetensors",
234
+ "model.layers.31.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
235
+ "model.layers.31.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
236
+ "model.layers.31.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
237
+ "model.layers.31.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
238
+ "model.layers.31.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
239
+ "model.layers.31.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
240
+ "model.layers.31.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
241
+ "model.layers.31.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
242
+ "model.layers.4.input_layernorm.weight": "model-00001-of-00003.safetensors",
243
+ "model.layers.4.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
244
+ "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
245
+ "model.layers.4.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
246
+ "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
247
+ "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
248
+ "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
249
+ "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
250
+ "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
251
+ "model.layers.5.input_layernorm.weight": "model-00001-of-00003.safetensors",
252
+ "model.layers.5.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
253
+ "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
254
+ "model.layers.5.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
255
+ "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
256
+ "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
257
+ "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
258
+ "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
259
+ "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
260
+ "model.layers.6.input_layernorm.weight": "model-00001-of-00003.safetensors",
261
+ "model.layers.6.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
262
+ "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
263
+ "model.layers.6.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
264
+ "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
265
+ "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
266
+ "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
267
+ "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
268
+ "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
269
+ "model.layers.7.input_layernorm.weight": "model-00001-of-00003.safetensors",
270
+ "model.layers.7.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
271
+ "model.layers.7.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
272
+ "model.layers.7.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
273
+ "model.layers.7.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
274
+ "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
275
+ "model.layers.7.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
276
+ "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
277
+ "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
278
+ "model.layers.8.input_layernorm.weight": "model-00001-of-00003.safetensors",
279
+ "model.layers.8.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
280
+ "model.layers.8.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
281
+ "model.layers.8.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
282
+ "model.layers.8.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
283
+ "model.layers.8.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
284
+ "model.layers.8.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
285
+ "model.layers.8.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
286
+ "model.layers.8.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
287
+ "model.layers.9.input_layernorm.weight": "model-00001-of-00003.safetensors",
288
+ "model.layers.9.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
289
+ "model.layers.9.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
290
+ "model.layers.9.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
291
+ "model.layers.9.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
292
+ "model.layers.9.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
293
+ "model.layers.9.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
294
+ "model.layers.9.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
295
+ "model.layers.9.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
296
+ "model.norm.weight": "model-00003-of-00003.safetensors"
297
+ }
298
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,24 @@
+ {
+ "bos_token": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "eos_token": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": "</s>",
+ "unk_token": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,44 @@
+ {
+ "add_bos_token": true,
+ "add_eos_token": false,
+ "add_prefix_space": null,
+ "added_tokens_decoder": {
+ "0": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "1": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "2": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "additional_special_tokens": [],
+ "bos_token": "<s>",
+ "clean_up_tokenization_spaces": false,
+ "eos_token": "</s>",
+ "extra_special_tokens": {},
+ "legacy": false,
+ "model_max_length": 1000000000000000019884624838656,
+ "pad_token": "</s>",
+ "sp_model_kwargs": {},
+ "spaces_between_special_tokens": false,
+ "tokenizer_class": "LlamaTokenizerFast",
+ "unk_token": "<unk>",
+ "use_default_system_prompt": false
+ }
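One detail worth noting in the files added above: `special_tokens_map.json` aliases `pad_token` to `</s>` (Mistral-7B ships no dedicated pad token), while `tokenizer_config.json` sets `add_bos_token: true` and `add_eos_token: false`. A minimal sketch of a consistency check over these two files, using only the fields shown in this commit:

```python
import json

# Fields from special_tokens_map.json as added in this commit,
# trimmed to what the checks below need.
special_tokens_map = json.loads("""
{
  "bos_token": {"content": "<s>"},
  "eos_token": {"content": "</s>"},
  "pad_token": "</s>",
  "unk_token": {"content": "<unk>"}
}
""")

# Corresponding fields from tokenizer_config.json.
tokenizer_config = {
    "add_bos_token": True,
    "add_eos_token": False,
    "bos_token": "<s>",
    "eos_token": "</s>",
    "pad_token": "</s>",
    "unk_token": "<unk>",
    "tokenizer_class": "LlamaTokenizerFast",
}

# The pad token is aliased to EOS, so batched generation should pass an
# explicit attention_mask rather than relying on a distinct <pad> id.
assert special_tokens_map["pad_token"] == tokenizer_config["eos_token"]

# BOS is prepended automatically; EOS is not appended.
assert tokenizer_config["add_bos_token"] and not tokenizer_config["add_eos_token"]

print("pad aliases eos:", special_tokens_map["pad_token"])
```

Because of the pad/EOS aliasing, left-padding plus an explicit `attention_mask` is the usual choice when batching prompts for generation with this tokenizer.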