amirali1985 committed on
Commit 30d4af3 · verified · 1 Parent(s): ae342ff

Upload add_sub_sorl_v1_abs10_K1_50K

add_sub_sorl_v1_abs10_K1_50K/config.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "architectures": [
+     "SorlModelWrapper"
+   ],
+   "attention_bias": false,
+   "attention_dropout": 0.0,
+   "bos_token_id": null,
+   "dtype": "float32",
+   "eos_token_id": null,
+   "head_dim": 128,
+   "hidden_act": "silu",
+   "hidden_size": 510,
+   "initializer_range": 0.02,
+   "intermediate_size": 2040,
+   "layer_types": [
+     "full_attention",
+     "full_attention"
+   ],
+   "max_position_embeddings": 128,
+   "max_window_layers": 28,
+   "model_type": "qwen3",
+   "num_attention_heads": 3,
+   "num_hidden_layers": 2,
+   "num_key_value_heads": 3,
+   "pad_token_id": null,
+   "rms_norm_eps": 1e-06,
+   "rope_parameters": {
+     "rope_theta": 10000.0,
+     "rope_type": "default"
+   },
+   "sliding_window": null,
+   "tie_word_embeddings": false,
+   "transformers_version": "5.5.0",
+   "use_cache": true,
+   "use_sliding_window": false,
+   "vocab_size": 151654
+ }
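Note that `head_dim` (128) times `num_attention_heads` (3) is 384, not `hidden_size` (510), so the attention projections are not square. The sketch below works out the projection shapes and a rough weight count from the values in this config. It assumes a plain Qwen3-style layout (separate q/k/v/o projections without bias and a gated MLP with three weight matrices); anything `SorlModelWrapper` adds on top is not visible in this diff and is not counted. The function names `projection_shapes` and `approx_param_count` are illustrative only.

```python
# Values copied from the config.json diff above.
config = {
    "hidden_size": 510,
    "intermediate_size": 2040,
    "head_dim": 128,
    "num_attention_heads": 3,
    "num_key_value_heads": 3,
    "num_hidden_layers": 2,
    "vocab_size": 151654,
    "tie_word_embeddings": False,
}


def projection_shapes(cfg):
    """Return (in_features, out_features) for each attention projection.

    Assumes the Qwen3-style convention where head_dim is set explicitly,
    so the attention width is num_heads * head_dim rather than hidden_size.
    """
    h = cfg["hidden_size"]
    q_out = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_out = cfg["num_key_value_heads"] * cfg["head_dim"]
    return {
        "q_proj": (h, q_out),
        "k_proj": (h, kv_out),
        "v_proj": (h, kv_out),
        "o_proj": (q_out, h),
    }


def approx_param_count(cfg):
    """Count weight-matrix parameters only (norm weights are omitted).

    Assumption: no projection biases, and a gated MLP with
    gate, up and down matrices of size hidden x intermediate.
    """
    h = cfg["hidden_size"]
    attn = sum(i * o for i, o in projection_shapes(cfg).values())
    mlp = 3 * h * cfg["intermediate_size"]
    layers = cfg["num_hidden_layers"] * (attn + mlp)
    embed = cfg["vocab_size"] * h
    lm_head = 0 if cfg["tie_word_embeddings"] else embed
    return {
        "embeddings": embed,
        "lm_head": lm_head,
        "layers": layers,
        "total": embed + lm_head + layers,
    }


if __name__ == "__main__":
    print(projection_shapes(config))
    print(approx_param_count(config))
```

Under these assumptions the two transformer layers account for only a small share of the weights; the untied embedding and output matrices over the 151654-token vocabulary dominate.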
add_sub_sorl_v1_abs10_K1_50K/generation_config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "_from_model_config": true,
+   "output_attentions": false,
+   "output_hidden_states": false,
+   "transformers_version": "5.5.0",
+   "use_cache": true
+ }
add_sub_sorl_v1_abs10_K1_50K/metrics.json ADDED
@@ -0,0 +1,2257 @@
+ {
+   "history": {
+     "step": [
+       50,
+       100,
+       150,
+       200,
+       250,
+       300,
+       350,
+       400,
+       450,
+       500,
+       550,
+       600,
+       650,
+       700,
+       750,
+       832,
+       882,
+       932,
+       982,
+       1032,
+       1082,
+       1132,
+       1182,
+       1232,
+       1282,
+       1332,
+       1382,
+       1432,
+       1482,
+       1532,
+       1614,
+       1664,
+       1714,
+       1764,
+       1814,
+       1864,
+       1914,
+       1964,
+       2014,
+       2064,
+       2114,
+       2164,
+       2214,
+       2264,
+       2314,
+       2396,
+       2446,
+       2496,
+       2546,
+       2596,
+       2646,
+       2696,
+       2746,
+       2796,
+       2846,
+       2896,
+       2946,
+       2996,
+       3046,
+       3096,
+       3178,
+       3228,
+       3278,
+       3328,
+       3378,
+       3428,
+       3478,
+       3528,
+       3578,
+       3628,
+       3678,
+       3728,
+       3778,
+       3828,
+       3878,
+       3960,
+       4010,
+       4060,
+       4110,
+       4160,
+       4210,
+       4260,
+       4310,
+       4360,
+       4410,
+       4460,
+       4510,
+       4560,
+       4610,
+       4660,
+       4742,
+       4792,
+       4842,
+       4892,
+       4942,
+       4992,
+       5042,
+       5092,
+       5142,
+       5192,
+       5242,
+       5292,
+       5342,
+       5392,
+       5442,
+       5524,
+       5574,
+       5624,
+       5674,
+       5724,
+       5774,
+       5824,
+       5874,
+       5924,
+       5974,
+       6024,
+       6074,
+       6124,
+       6174,
+       6224,
+       6306,
+       6356,
+       6406,
+       6456,
+       6506,
+       6556,
+       6606,
+       6656,
+       6706,
+       6756,
+       6806,
+       6856,
+       6906,
+       6956,
+       7006,
+       7088,
+       7138,
+       7188,
+       7238,
+       7288,
+       7338,
+       7388,
+       7438,
+       7488,
+       7538,
+       7588,
+       7638,
+       7688,
+       7738,
+       7788
+     ],
+     "loss": [
+       7.291482925415039,
+       4.033167839050293,
+       3.4122180938720703,
+       3.0403032302856445,
+       3.001980781555176,
+       2.982646942138672,
+       2.7727580070495605,
+       2.1000826358795166,
+       1.4242171049118042,
+       -0.10749959945678711,
+       -2.012298583984375,
+       -4.46280574798584,
+       -6.6882829666137695,
+       -7.129601955413818,
+       -7.274585723876953,
+       -6.2049970626831055,
+       -5.417169570922852,
+       -4.86923885345459,
+       -4.55937385559082,
+       -3.540888547897339,
+       -4.229403972625732,
+       -3.292853832244873,
+       -2.741316795349121,
+       -3.3925161361694336,
+       -2.6337685585021973,
+       -2.5086615085601807,
+       -2.414508581161499,
+       -2.070335865020752,
+       -2.396928310394287,
+       -1.793986439704895,
+       -0.9649119973182678,
+       -0.9085626006126404,
+       -0.6782669425010681,
+       -0.48969733715057373,
+       -0.37728095054626465,
+       -0.5114427208900452,
+       -0.3232823610305786,
+       -0.42648670077323914,
+       -0.7195917963981628,
+       -0.744903028011322,
+       -0.6740866899490356,
+       -0.33350852131843567,
+       -0.27009108662605286,
+       -0.26920798420906067,
+       -0.22950142621994019,
+       -0.26204606890678406,
+       -0.29803380370140076,
+       -0.30089274048805237,
+       -0.17959536612033844,
+       -0.2130807489156723,
+       -0.26070791482925415,
+       -0.23502995073795319,
+       -0.14740601181983948,
+       -0.12747667729854584,
+       -0.16737529635429382,
+       -0.2518279254436493,
+       -0.1113758385181427,
+       -0.09704112261533737,
+       0.016159119084477425,
+       -0.11811869591474533,
+       -0.1814730316400528,
+       -0.016289565712213516,
+       -0.1350007951259613,
+       -0.2632235884666443,
+       -0.07215753197669983,
+       -0.06786702573299408,
+       -0.05473274365067482,
+       -0.11306267976760864,
+       -0.008685859851539135,
+       -0.26242750883102417,
+       -0.03735169768333435,
+       -0.01853231154382229,
+       -0.07519838958978653,
+       -0.0790640115737915,
+       -0.06906218081712723,
+       -0.039443038403987885,
+       -0.0738786831498146,
+       -0.044155001640319824,
+       -0.07805009186267853,
+       -0.04168146848678589,
+       -0.12484845519065857,
+       -0.051792487502098083,
+       -0.04258032515645027,
+       -0.04226825386285782,
+       -0.07702398300170898,
+       -0.058789029717445374,
+       -0.03826560452580452,
+       -0.018153337761759758,
+       -0.021531714126467705,
+       -0.01795976608991623,
+       -0.03070964477956295,
+       -0.006802724674344063,
+       -0.003459150902926922,
+       0.0014321315102279186,
+       -0.022512346506118774,
+       -0.15685448050498962,
+       0.0004717526026070118,
+       -0.08908119797706604,
+       -0.006331631913781166,
+       -0.055951304733753204,
+       -0.08887043595314026,
+       0.40162384510040283,
+       -0.007660364266484976,
+       -0.0002518543042242527,
+       -0.034914687275886536,
+       0.008662399835884571,
+       -0.006906072609126568,
+       -0.008926203474402428,
+       -0.053697872906923294,
+       -0.04174011945724487,
+       -0.00622951053082943,
+       -0.03008740395307541,
+       0.0052336109802126884,
+       -0.09772835671901703,
+       -0.02276228368282318,
+       0.0019192327745258808,
+       0.004020964726805687,
+       0.0033772005699574947,
+       0.0018916467670351267,
+       -0.00820618774741888,
+       0.004954045172780752,
+       -0.04910368472337723,
+       -0.014934251084923744,
+       -0.0010213176719844341,
+       -0.01465108897536993,
+       0.006254756823182106,
+       0.004511718172580004,
+       -0.14273616671562195,
+       0.005409314297139645,
+       -0.026461634784936905,
+       0.0013872201088815928,
+       0.007202650420367718,
+       0.007318790070712566,
+       0.005063582677394152,
+       0.0029275300912559032,
+       0.007748285308480263,
+       0.0013876920565962791,
+       0.0028375410474836826,
+       0.009077148512005806,
+       0.0024727603886276484,
+       0.0072018057107925415,
+       0.0035855781752616167,
+       0.004427475389093161,
+       0.005288178101181984,
+       0.006763830315321684,
+       0.007230003830045462,
+       0.006502293515950441,
+       0.006789479870349169,
+       0.0021875742822885513,
+       0.0073264106176793575
+     ],
+     "base_loss": [
+       6.516536235809326,
+       2.4835619926452637,
+       1.899435043334961,
+       1.8553259372711182,
+       1.9413520097732544,
+       1.8268587589263916,
+       1.7858167886734009,
+       1.7588903903961182,
+       1.794765830039978,
+       1.750693440437317,
+       1.748929738998413,
+       1.68966543674469,
+       1.7001359462738037,
+       1.5033057928085327,
+       1.4678022861480713,
+       1.2599419355392456,
+       1.0703829526901245,
+       0.9103147387504578,
+       0.9035982489585876,
+       0.7412249445915222,
+       0.7162189483642578,
+       0.6284891963005066,
+       0.5025827288627625,
+       0.5255685448646545,
+       0.44132092595100403,
+       0.403102844953537,
+       0.33637675642967224,
+       0.3013179302215576,
+       0.3363364040851593,
+       0.25080448389053345,
+       0.13762901723384857,
+       0.13802115619182587,
+       0.09735461324453354,
+       0.07956259697675705,
+       0.06039976328611374,
+       0.09545262902975082,
+       0.04513740912079811,
+       0.05585834011435509,
+       0.08786828815937042,
+       0.09875041991472244,
+       0.08633076399564743,
+       0.05746101960539818,
+       0.035193223506212234,
+       0.03549731522798538,
+       0.02876146510243416,
+       0.032679636031389236,
+       0.04298492148518562,
+       0.03797732666134834,
+       0.024759531021118164,
+       0.0304957814514637,
+       0.03193579241633415,
+       0.027899155393242836,
+       0.01801885850727558,
+       0.017517048865556717,
+       0.019884327426552773,
+       0.03035196103155613,
+       0.01430182158946991,
+       0.011786989867687225,
+       0.018495408818125725,
+       0.014813835732638836,
+       0.021165570244193077,
+       0.00257376697845757,
+       0.01620938815176487,
+       0.030093954876065254,
+       0.008949460461735725,
+       0.008293701335787773,
+       0.0069602918811142445,
+       0.013352221809327602,
+       0.00192145979963243,
+       0.03061601147055626,
+       0.005475928541272879,
+       0.00301953568123281,
+       0.00922973733395338,
+       0.01043914258480072,
+       0.008999384008347988,
+       0.005764245521277189,
+       0.009663080796599388,
+       0.005371266044676304,
+       0.009819733910262585,
+       0.006027544848620892,
+       0.014493056572973728,
+       0.006732997950166464,
+       0.00728581240400672,
+       0.006032108794897795,
+       0.009137012995779514,
+       0.007936102338135242,
+       0.005118044558912516,
+       0.0029691599775105715,
+       0.003965041600167751,
+       0.003704359754920006,
+       0.004141169600188732,
+       0.0015686674742028117,
+       0.001501810154877603,
+       0.0002402652899036184,
+       0.0029597077518701553,
+       0.018230454996228218,
+       0.0006747310981154442,
+       0.010286212898790836,
+       0.001309316256083548,
+       0.00767873739823699,
+       0.010770417749881744,
+       0.0008323562215082347,
+       0.0013423433993011713,
+       0.0005205980269238353,
+       0.0044253370724618435,
+       9.446766489418224e-05,
+       0.0013240036787465215,
+       0.0024794754572212696,
+       0.007238837890326977,
+       0.005449187941849232,
+       0.0012252583401277661,
+       0.004047301132231951,
+       6.462168676080182e-05,
+       0.0121733034029603,
+       0.0035667994525283575,
+       0.0005734098376706243,
+       0.0004332214593887329,
+       0.0005883773555979133,
+       0.000351788941770792,
+       0.0014397574122995138,
+       0.00022652784537058324,
+       0.006164551712572575,
+       0.001950864098034799,
+       0.0008269333047792315,
+       0.002537141088396311,
+       2.746866266534198e-05,
+       0.00023421202786266804,
+       0.016300244256854057,
+       0.0003031859523616731,
+       0.0032388202380388975,
+       0.00036090545472688973,
+       0.00011428723519202322,
+       0.00010272496001562104,
+       0.00026342846103943884,
+       7.695650856476277e-05,
+       0.00025681612896732986,
+       0.0010053400183096528,
+       9.518813021713868e-05,
+       4.364967389847152e-05,
+       9.891855006571859e-05,
+       3.729294257936999e-05,
+       8.083302236627787e-05,
+       0.00014823059609625489,
+       3.982062844443135e-05,
+       4.8367975978180766e-05,
+       3.223081512260251e-05,
+       5.926141238887794e-05,
+       0.00012397172395139933,
+       4.300850923755206e-05,
+       2.1748028302681632e-05
+     ],
+     "info_loss": [
+       -0.39558982849121094,
+       -0.04619121551513672,
+       -0.03844738006591797,
+       -0.06993865966796875,
+       -0.0818336009979248,
+       -0.071785569190979,
+       -0.08853816986083984,
+       -0.15343570709228516,
+       -0.2237008810043335,
+       -0.37132859230041504,
+       -0.5553083419799805,
+       -0.78294438123703,
+       -0.9912639856338501,
+       -1.0034464597702026,
+       -1.0093202590942383,
+       -0.8740698099136353,
+       -0.7721136212348938,
+       -0.6993796825408936,
+       -0.6589730381965637,
+       -0.5287246108055115,
+       -0.5838943719863892,
+       -0.4729439616203308,
+       -0.3963608741760254,
+       -0.4527805745601654,
+       -0.36455219984054565,
+       -0.3377186357975006,
+       -0.31269681453704834,
+       -0.27766433358192444,
+       -0.3063477575778961,
+       -0.23108206689357758,
+       -0.13183042407035828,
+       -0.12639640271663666,
+       -0.09110953658819199,
+       -0.06630340218544006,
+       -0.0564069002866745,
+       -0.07329370081424713,
+       -0.043559543788433075,
+       -0.05424797162413597,
+       -0.08596842736005783,
+       -0.0910455510020256,
+       -0.08357047289609909,
+       -0.04377105087041855,
+       -0.03403441235423088,
+       -0.03457903489470482,
+       -0.028307367116212845,
+       -0.03156619891524315,
+       -0.037180345505476,
+       -0.03508739173412323,
+       -0.02273467183113098,
+       -0.02561572939157486,
+       -0.031209608539938927,
+       -0.02763480320572853,
+       -0.017675096169114113,
+       -0.016021262854337692,
+       -0.01964615099132061,
+       -0.029404595494270325,
+       -0.0141890374943614,
+       -0.01162042934447527,
+       -0.001841643825173378,
+       -0.014748577028512955,
+       -0.021103857085108757,
+       -0.0025317047256976366,
+       -0.016158735379576683,
+       -0.030070843175053596,
+       -0.008906206116080284,
+       -0.0082675376906991,
+       -0.006873094942420721,
+       -0.013306264765560627,
+       -0.0018865098245441914,
+       -0.030588379129767418,
+       -0.005302963312715292,
+       -0.0029850159771740437,
+       -0.00920331384986639,
+       -0.010163646191358566,
+       -0.00896731298416853,
+       -0.005701817572116852,
+       -0.009641165845096111,
+       -0.005342759657651186,
+       -0.009793885052204132,
+       -0.00600464129820466,
+       -0.014428073540329933,
+       -0.006682424806058407,
+       -0.005700113717466593,
+       -0.00503776129335165,
+       -0.009123606607317924,
+       -0.007870280183851719,
+       -0.005100195296108723,
+       -0.002949395217001438,
+       -0.0039481655694544315,
+       -0.003691060934215784,
+       -0.00412713410332799,
+       -0.0015408744802698493,
+       -0.0014740722253918648,
+       -0.00021267203555908054,
+       -0.002890156116336584,
+       -0.018215106800198555,
+       -0.0006676774355582893,
+       -0.0102800028398633,
+       -0.0013014374999329448,
+       -0.007669177837669849,
+       -0.010761483572423458,
+       0.03959464281797409,
+       -0.0013354456750676036,
+       -0.0004929776769131422,
+       -0.0044188955798745155,
+       -9.018523996928707e-05,
+       -0.0013183815171942115,
+       -0.0024549937807023525,
+       -0.00722320144996047,
+       -0.005443317350000143,
+       -0.001220099744386971,
+       -0.004042269662022591,
+       -6.167100218590349e-05,
+       -0.012167608365416527,
+       -0.0035586960148066282,
+       -0.0005662220646627247,
+       -0.0004291559162084013,
+       -0.0005853284965269268,
+       -0.0003490628441795707,
+       -0.0014374355087056756,
+       -0.00022408299264498055,
+       -0.00616141501814127,
+       -0.001947026583366096,
+       -0.0008238331065513194,
+       -0.002534637926146388,
+       -2.5075705707422458e-05,
+       -0.00023122489801608026,
+       -0.0162954144179821,
+       -0.0002997294650413096,
+       -0.0032353017013520002,
+       -0.00035792760900221765,
+       -0.00011160584108438343,
+       -0.00010035171726485714,
+       -0.0002614471595734358,
+       -7.487701077479869e-05,
+       -0.0002547329058870673,
+       -0.0010033685248345137,
+       -9.309347660746425e-05,
+       -4.177000664640218e-05,
+       -9.644340752856806e-05,
+       -3.56615346390754e-05,
+       -7.872026617405936e-05,
+       -0.00014596803521271795,
+       -3.823072984232567e-05,
+       -4.660113700083457e-05,
+       -3.038973045477178e-05,
+       -5.761536885984242e-05,
+       -0.0001223341969307512,
+       -4.1437237086938694e-05,
+       -2.0190593204461038e-05
+     ],
+     "abs_loss": [
+       2.103454351425171,
+       1.8856569528579712,
+       1.8366988897323608,
+       1.853995680809021,
+       1.8650015592575073,
+       1.8378944396972656,
+       1.8304977416992188,
+       1.8361488580703735,
+       1.821694016456604,
+       1.807782769203186,
+       1.6753603219985962,
+       1.4515177011489868,
+       1.322694182395935,
+       1.1485607624053955,
+       1.1458325386047363,
+       0.9455111622810364,
+       0.8843372464179993,
+       0.8360934853553772,
+       0.7149937748908997,
+       0.5886721611022949,
+       0.5124597549438477,
+       0.43022820353507996,
+       0.3434896767139435,
+       0.29655787348747253,
+       0.26246532797813416,
+       0.22259850800037384,
+       0.19756682217121124,
+       0.22054968774318695,
+       0.17242246866226196,
+       0.1368202120065689,
+       0.1471979171037674,
+       0.1158660277724266,
+       0.10501883178949356,
+       0.08231943845748901,
+       0.08551102876663208,
+       0.09424110502004623,
+       0.06484600156545639,
+       0.05951952934265137,
+       0.04611345753073692,
+       0.05700722336769104,
+       0.045443009585142136,
+       0.052510228008031845,
+       0.04553452134132385,
+       0.027561327442526817,
+       0.03692733123898506,
+       0.029859310016036034,
+       0.01610124669969082,
+       0.01593116857111454,
+       0.012548821978271008,
+       0.024183565750718117,
+       0.015415686182677746,
+       0.008188403211534023,
+       0.013608518056571484,
+       0.02348317764699459,
+       0.02088686265051365,
+       0.0142488032579422,
+       0.010507809929549694,
+       0.011757518164813519,
+       0.007792810443788767,
+       0.007188592571765184,
+       0.012610410340130329,
+       0.009439573623239994,
+       0.0064491513185203075,
+       0.006307272240519524,
+       0.0056409514509141445,
+       0.006569093093276024,
+       0.007965529337525368,
+       0.008747314102947712,
+       0.010841268114745617,
+       0.008278321474790573,
+       0.009732459671795368,
+       0.010417111217975616,
+       0.008557966910302639,
+       0.005075481254607439,
+       0.001646698801778257,
+       0.00539746368303895,
+       0.002312362426891923,
+       0.004107427317649126,
+       0.008200561627745628,
+       0.0017948299646377563,
+       0.0003762578417081386,
+       0.002077900804579258,
+       0.006949333474040031,
+       0.004756436217576265,
+       0.002833310514688492,
+       0.0020073887426406145,
+       0.006080996710807085,
+       0.0006730963359586895,
+       0.001914774184115231,
+       0.02848239243030548,
+       0.008244752883911133,
+       0.003306052880361676,
+       0.005812538787722588,
+       0.0005439592059701681,
+       0.0033859622199088335,
+       0.00019894744036719203,
+       0.003157586557790637,
+       0.0017652591923251748,
+       0.0005068835453130305,
+       0.0009860001737251878,
+       0.0002652962866704911,
+       0.0014243596233427525,
+       0.00014402759552467614,
+       0.001731699681840837,
+       0.0007810410461388528,
+       0.0018737445352599025,
+       0.004320906475186348,
+       0.007283887360244989,
+       0.0010027083335444331,
+       0.002329479670152068,
+       7.454246951965615e-05,
+       0.0003903764591086656,
+       0.0004755467234645039,
+       0.00022724166046828032,
+       8.931435877457261e-05,
+       0.0002792776213027537,
+       0.0016023950884118676,
+       9.345952275907621e-05,
+       0.00014107099559623748,
+       0.00021528666547965258,
+       0.00014757552708033472,
+       0.00025952685973607004,
+       0.00019442291522864252,
+       0.00011387479025870562,
+       5.852747199242003e-05,
+       0.0016362796304747462,
+       5.2294944907771423e-05,
+       0.00010021971684182063,
+       8.62984816194512e-05,
+       0.00018095503037329763,
+       0.00010219896648777649,
+       0.00011972735956078395,
+       6.06870926276315e-05,
+       9.31524918996729e-05,
+       4.9771664635045454e-05,
+       3.388823461136781e-05,
+       3.732774348463863e-05,
+       4.141781755606644e-05,
+       2.3655571567360312e-05,
+       0.00012349901953712106,
+       2.4177439627237618e-05,
+       3.144523725495674e-05,
+       3.530609319568612e-05,
+       0.00014106860908214003,
+       4.612208067555912e-05,
+       5.547635373659432e-05,
+       5.11908256157767e-05,
+       3.4463100746506825e-05,
+       4.004925722256303e-05,
+       4.555788837024011e-05
+     ],
+     "zipf_loss": [
+       4.520499229431152,
+       1.822952151298523,
+       1.7135869264602661,
+       1.6989643573760986,
+       1.6924647092819214,
+       1.689854383468628,
+       1.6892731189727783,
+       1.69193434715271,
+       1.6842906475067139,
+       1.6743146181106567,
+       1.6243189573287964,
+       1.531821370124817,
+       1.3919508457183838,
+       1.2867016792297363,
+       1.236231803894043,
+       1.1812078952789307,
+       1.1451495885849,
+       1.1306335926055908,
+       1.0552587509155273,
+       0.9462655186653137,
+       0.8420745134353638,
+       0.7650742530822754,
+       0.6853601336479187,
+       0.5800656080245972,
+       0.5441861152648926,
+       0.44316205382347107,
+       0.3563261330127716,
+       0.3829346299171448,
+       0.31297069787979126,
+       0.2523476183414459,
+       0.20104341208934784,
+       0.2057938128709793,
+       0.12497197836637497,
+       0.08554214239120483,
+       0.11783720552921295,
+       0.11661753803491592,
+       0.06069108471274376,
+       0.05418270081281662,
+       0.04761284589767456,
+       0.06110139936208725,
+       0.07074299454689026,
+       0.041489966213703156,
+       0.030506331473588943,
+       0.03832893818616867,
+       0.021118007600307465,
+       0.017950356006622314,
+       0.029174601659178734,
+       0.010410734452307224,
+       0.021736932918429375,
+       0.01016240008175373,
+       0.017910826951265335,
+       0.012600082904100418,
+       0.00996524840593338,
+       0.012870588339865208,
+       0.007113197818398476,
+       0.010441196151077747,
+       0.015161935240030289,
+       0.006200422998517752,
+       0.015300867147743702,
+       0.013834364712238312,
+       0.0071389381773769855,
+       0.005509757436811924,
+       0.009732245467603207,
+       0.006760149262845516,
+       0.007390973158180714,
+       0.005857738666236401,
+       0.006241363473236561,
+       0.005773015320301056,
+       0.007173650898039341,
+       0.012012462131679058,
+       0.009228762239217758,
+       0.007256601005792618,
+       0.006749220658093691,
+       0.011625763028860092,
+       0.011446894146502018,
+       0.011271147057414055,
+       0.01263866014778614,
+       0.0034905869979411364,
+       0.00924896914511919,
+       0.012157914228737354,
+       0.0049015856347978115,
+       0.008090977557003498,
+       0.006440065801143646,
+       0.0016016059089452028,
+       0.004791740328073502,
+       0.011776924133300781,
+       0.007010199595242739,
+       0.008304143324494362,
+       0.013793421909213066,
+       0.01239824015647173,
+       0.005596051458269358,
+       0.006706747226417065,
+       0.0091985072940588,
+       0.0032641906291246414,
+       0.0030909115448594093,
+       0.007046230137348175,
+       0.006158037111163139,
+       0.0032560895197093487,
+       0.00532273855060339,
+       0.012963132932782173,
+       0.00794745422899723,
+       0.004702637903392315,
+       0.004337346646934748,
+       0.00398415420204401,
+       0.004770827479660511,
+       0.009282410144805908,
+       0.004521648399531841,
+       0.012415871024131775,
+       0.011195037513971329,
+       0.007010922767221928,
+       0.00473877415060997,
+       0.006248955614864826,
+       0.005738144740462303,
+       0.011751700192689896,
+       0.00924894493073225,
+       0.0069801160134375095,
+       0.007719063200056553,
+       0.008632762357592583,
+       0.0050163790583610535,
+       0.004706881009042263,
+       0.006953589618206024,
+       0.006319958716630936,
+       0.002565708477050066,
+       0.006378693040460348,
+       0.00815229769796133,
+       0.006314417347311974,
+       0.00658452557399869,
+       0.00390771497040987,
+       0.008094793185591698,
+       0.002634467789903283,
+       0.004595370963215828,
+       0.008192448876798153,
+       0.008213513530790806,
+       0.007405310403555632,
+       0.003594366367906332,
+       0.010035409592092037,
+       0.010412304662168026,
+       0.0036691459827125072,
+       0.009448833763599396,
+       0.0033259259071201086,
+       0.007518710568547249,
+       0.004288803320378065,
+       0.005735394544899464,
+       0.005616557784378529,
+       0.007176861632615328,
+       0.007496122736483812,
+       0.0070140669122338295,
+       0.0078854039311409,
+       0.0025549333076924086,
+       0.007502012886106968
+     ],
+     "denoise_loss": [],
+     "ortho_loss": [
+       0.2111450433731079,
+       0.1435524970293045,
+       0.06681142747402191,
+       0.052996307611465454,
+       0.05158527195453644,
+       0.052943676710128784,
+       0.05298374965786934,
+       0.06948061287403107,
+       0.0728166475892067,
+       0.080330990254879,
+       0.09410754591226578,
+       0.12913890182971954,
+       0.1634664088487625,
+       0.17435631155967712,
+       0.18374793231487274,
+       0.19662483036518097,
+       0.21952885389328003,
+       0.23227910697460175,
+       0.2511540353298187,
+       0.24509447813034058,
+       0.24284040927886963,
+       0.24897213280200958,
+       0.2506718933582306,
+       0.2575362026691437,
+       0.2686024606227875,
+       0.2750362753868103,
+       0.2782677114009857,
+       0.2853957712650299,
+       0.29871729016304016,
+       0.30882352590560913,
+       0.3340570032596588,
+       0.32771769165992737,
+       0.32639268040657043,
+       0.3189693093299866,
+       0.31750908493995667,
+       0.32593634724617004,
+       0.3260767161846161,
+       0.3234666585922241,
+       0.3184603154659271,
+       0.31702885031700134,
+       0.31575602293014526,
+       0.3068729639053345,
+       0.31276628375053406,
+       0.29689764976501465,
+       0.30121129751205444,
+       0.3054710626602173,
+       0.3092142939567566,
+       0.3109016418457031,
+       0.30108126997947693,
+       0.2955969274044037,
+       0.2946198880672455,
+       0.29174482822418213,
+       0.285043865442276,
+       0.28630417585372925,
+       0.2912791967391968,
+       0.2771497368812561,
+       0.2789520025253296,
+       0.28473028540611267,
+       0.28749844431877136,
+       0.28734689950942993,
+       0.28948697447776794,
+       0.29193195700645447,
+       0.2959407567977905,
+       0.29702845215797424,
+       0.2931220233440399,
+       0.2979944050312042,
+       0.297878235578537,
+       0.30620715022087097,
+       0.3166170120239258,
+       0.3252231180667877,
+       0.32210466265678406,
+       0.32464829087257385,
+       0.32355090975761414,
+       0.3284517526626587,
+       0.32824888825416565,
+       0.32835543155670166,
+       0.32803553342819214,
+       0.3309395909309387,
+       0.3305571675300598,
+       0.3290838897228241,
+       0.33054694533348083,
+       0.32293567061424255,
+       0.3265998661518097,
+       0.3343144953250885,
+       0.33344414830207825,
+       0.3310560882091522,
+       0.337158739566803,
+       0.33970019221305847,
+       0.3424912393093109,
+       0.34168195724487305,
+       0.34499648213386536,
+       0.3437992036342621,
+       0.346343994140625,
+       0.34596961736679077,
+       0.3452761769294739,
+       0.34677624702453613,
+       0.3484945297241211,
+       0.3473084568977356,
+       0.35560914874076843,
+       0.3561987280845642,
+       0.357473760843277,
+       0.35667869448661804,
+       0.35596200823783875,
+       0.3558987081050873,
+       0.3573693335056305,
+       0.3576008677482605,
+       0.355884313583374,
+       0.3569817543029785,
+       0.3577706217765808,
+       0.35831722617149353,
+       0.3573484718799591,
+       0.35760417580604553,
+       0.35809966921806335,
+       0.35827332735061646,
+       0.36096251010894775,
+       0.3610628843307495,
+       0.36021000146865845,
+       0.36042484641075134,
+       0.36087724566459656,
+       0.36099040508270264,
+       0.3603220582008362,
+       0.35995814204216003,
+       0.35976719856262207,
+       0.3594537079334259,
+       0.36095303297042847,
+       0.3614071309566498,
+       0.36208659410476685,
+       0.36218127608299255,
+       0.36316877603530884,
+       0.36409878730773926,
+       0.36422309279441833,
+       0.36380311846733093,
+       0.36408865451812744,
+       0.36326295137405396,
+       0.36311736702919006,
+       0.3631589412689209,
+       0.3635577857494354,
+       0.3654131591320038,
+       0.3654572069644928,
+       0.36536282300949097,
+       0.365400493144989,
+       0.3654620051383972,
+       0.3656059801578522,
+       0.3652297258377075,
+       0.36522072553634644,
+       0.3652559518814087,
+       0.36530348658561707,
+       0.36531174182891846,
+       0.36535152792930603,
+       0.3653797209262848
+     ],
1068
+ "lr": [
1069
+ 7.840000000000001e-05,
1070
+ 8e-05,
1071
+ 8e-05,
1072
+ 8e-05,
1073
+ 8e-05,
1074
+ 8e-05,
1075
+ 8e-05,
1076
+ 8e-05,
1077
+ 8e-05,
1078
+ 8e-05,
1079
+ 8e-05,
1080
+ 8e-05,
1081
+ 8e-05,
1082
+ 8e-05,
1083
+ 8e-05,
1084
+ 8e-05,
1085
+ 8e-05,
1086
+ 8e-05,
1087
+ 8e-05,
1088
+ 8e-05,
1089
+ 8e-05,
1090
+ 8e-05,
1091
+ 8e-05,
1092
+ 8e-05,
1093
+ 8e-05,
1094
+ 8e-05,
1095
+ 8e-05,
1096
+ 8e-05,
1097
+ 8e-05,
1098
+ 8e-05,
1099
+ 8e-05,
1100
+ 8e-05,
1101
+ 8e-05,
1102
+ 8e-05,
1103
+ 8e-05,
1104
+ 8e-05,
1105
+ 8e-05,
1106
+ 8e-05,
1107
+ 8e-05,
1108
+ 8e-05,
1109
+ 8e-05,
1110
+ 8e-05,
1111
+ 8e-05,
1112
+ 8e-05,
1113
+ 8e-05,
1114
+ 8e-05,
1115
+ 8e-05,
1116
+ 8e-05,
1117
+ 8e-05,
1118
+ 8e-05,
1119
+ 8e-05,
1120
+ 8e-05,
1121
+ 8e-05,
1122
+ 8e-05,
1123
+ 8e-05,
1124
+ 8e-05,
1125
+ 8e-05,
1126
+ 8e-05,
1127
+ 8e-05,
1128
+ 8e-05,
1129
+ 8e-05,
1130
+ 8e-05,
1131
+ 8e-05,
1132
+ 8e-05,
1133
+ 8e-05,
1134
+ 8e-05,
1135
+ 8e-05,
1136
+ 8e-05,
1137
+ 8e-05,
1138
+ 8e-05,
1139
+ 8e-05,
1140
+ 8e-05,
1141
+ 8e-05,
1142
+ 8e-05,
1143
+ 8e-05,
1144
+ 8e-05,
1145
+ 8e-05,
1146
+ 8e-05,
1147
+ 8e-05,
1148
+ 8e-05,
1149
+ 8e-05,
1150
+ 8e-05,
1151
+ 8e-05,
1152
+ 8e-05,
1153
+ 8e-05,
1154
+ 8e-05,
1155
+ 8e-05,
1156
+ 8e-05,
1157
+ 8e-05,
1158
+ 8e-05,
1159
+ 7.932818532818534e-05,
1160
+ 7.816988416988418e-05,
1161
+ 7.701158301158302e-05,
1162
+ 7.585328185328185e-05,
1163
+ 7.469498069498071e-05,
1164
+ 7.353667953667954e-05,
1165
+ 7.237837837837838e-05,
1166
+ 7.122007722007721e-05,
1167
+ 7.006177606177606e-05,
1168
+ 6.890347490347492e-05,
1169
+ 6.774517374517375e-05,
1170
+ 6.65868725868726e-05,
1171
+ 6.542857142857144e-05,
1172
+ 6.427027027027027e-05,
1173
+ 6.311196911196911e-05,
1174
+ 6.121235521235521e-05,
1175
+ 6.0054054054054064e-05,
1176
+ 5.8895752895752895e-05,
1177
+ 5.773745173745175e-05,
1178
+ 5.6579150579150584e-05,
1179
+ 5.542084942084943e-05,
1180
+ 5.426254826254825e-05,
1181
+ 5.310424710424711e-05,
1182
+ 5.194594594594594e-05,
1183
+ 5.0787644787644786e-05,
1184
+ 4.9629343629343644e-05,
1185
+ 4.8471042471042475e-05,
1186
+ 4.7312741312741326e-05,
1187
+ 4.615444015444014e-05,
1188
+ 4.4996138996139e-05,
1189
+ 4.309652509652511e-05,
1190
+ 4.1938223938223946e-05,
1191
+ 4.07799227799228e-05,
1192
+ 3.962162162162162e-05,
1193
+ 3.846332046332047e-05,
1194
+ 3.73050193050193e-05,
1195
+ 3.6146718146718155e-05,
1196
+ 3.4988416988416986e-05,
1197
+ 3.383011583011584e-05,
1198
+ 3.267181467181467e-05,
1199
+ 3.151351351351352e-05,
1200
+ 3.0355212355212367e-05,
1201
+ 2.9196911196911198e-05,
1202
+ 2.8038610038610046e-05,
1203
+ 2.6880308880308876e-05,
1204
+ 2.4980694980694983e-05,
1205
+ 2.3822393822393838e-05,
1206
+ 2.266409266409267e-05,
1207
+ 2.1505791505791517e-05,
1208
+ 2.0347490347490348e-05,
1209
+ 1.9189189189189195e-05,
1210
+ 1.8030888030888026e-05,
1211
+ 1.6872586872586878e-05,
1212
+ 1.571428571428571e-05,
1213
+ 1.455598455598456e-05,
1214
+ 1.3397683397683389e-05,
1215
+ 1.223938223938224e-05,
1216
+ 1.1081081081081092e-05,
1217
+ 9.92277992277992e-06,
1218
+ 8.764478764478772e-06
1219
+ ],
1220
+ "emb_lr": [],
1221
+ "eval_step": [
1222
+ 750,
1223
+ 1532,
1224
+ 2314,
1225
+ 3096,
1226
+ 3878,
1227
+ 4660,
1228
+ 5442,
1229
+ 6224,
1230
+ 7006,
1231
+ 7788
1232
+ ],
1233
+ "eval_accuracy": [
1234
+ 0.0,
1235
+ 0.0,
1236
+ 0.0,
1237
+ 0.0,
1238
+ 0.0,
1239
+ 0.0,
1240
+ 0.0,
1241
+ 0.0,
1242
+ 0.0,
1243
+ 0.0
1244
+ ]
1245
+ },
1246
+ "final_accuracy": 1.0,
1247
+ "sft_eval": {
1248
+ "config": {
1249
+ "ops": "add_sub",
1250
+ "K": null,
1251
+ "mode": "sft",
1252
+ "n_digits": 6,
1253
+ "n_per_split": 50
1254
+ },
1255
+ "splits": {
1256
+ "add_S0": {
1257
+ "full_accuracy": 1.0,
1258
+ "n_examples": 50,
1259
+ "per_subtask": {
1260
+ "SA": {
1261
+ "accuracy": 1.0,
1262
+ "count": 295
1263
+ },
1264
+ "SS": {
1265
+ "accuracy": 1.0,
1266
+ "count": 55
1267
+ }
1268
+ }
1269
+ },
1270
+ "add_S1": {
1271
+ "full_accuracy": 1.0,
1272
+ "n_examples": 50,
1273
+ "per_subtask": {
1274
+ "SA": {
1275
+ "accuracy": 1.0,
1276
+ "count": 126
1277
+ },
1278
+ "SC": {
1279
+ "accuracy": 1.0,
1280
+ "count": 79
1281
+ },
1282
+ "SS": {
1283
+ "accuracy": 1.0,
1284
+ "count": 21
1285
+ },
1286
+ "UC": {
1287
+ "accuracy": 1.0,
1288
+ "count": 124
1289
+ }
1290
+ }
1291
+ },
1292
+ "add_S2": {
1293
+ "full_accuracy": 1.0,
1294
+ "n_examples": 50,
1295
+ "per_subtask": {
1296
+ "SA": {
1297
+ "accuracy": 1.0,
1298
+ "count": 75
1299
+ },
1300
+ "SC": {
1301
+ "accuracy": 1.0,
1302
+ "count": 62
1303
+ },
1304
+ "SS": {
1305
+ "accuracy": 1.0,
1306
+ "count": 39
1307
+ },
1308
+ "UC": {
1309
+ "accuracy": 1.0,
1310
+ "count": 111
1311
+ },
1312
+ "US": {
1313
+ "accuracy": 1.0,
1314
+ "count": 63
1315
+ }
1316
+ }
1317
+ },
1318
+ "add_S3": {
1319
+ "full_accuracy": 1.0,
1320
+ "n_examples": 50,
1321
+ "per_subtask": {
1322
+ "SA": {
1323
+ "accuracy": 1.0,
1324
+ "count": 60
1325
+ },
1326
+ "SC": {
1327
+ "accuracy": 1.0,
1328
+ "count": 57
1329
+ },
1330
+ "SS": {
1331
+ "accuracy": 1.0,
1332
+ "count": 19
1333
+ },
1334
+ "UC": {
1335
+ "accuracy": 1.0,
1336
+ "count": 104
1337
+ },
1338
+ "US": {
1339
+ "accuracy": 1.0,
1340
+ "count": 110
1341
+ }
1342
+ }
1343
+ },
1344
+ "add_S4": {
1345
+ "full_accuracy": 0.98,
1346
+ "n_examples": 50,
1347
+ "per_subtask": {
1348
+ "SA": {
1349
+ "accuracy": 1.0,
1350
+ "count": 48
1351
+ },
1352
+ "SC": {
1353
+ "accuracy": 1.0,
1354
+ "count": 52
1355
+ },
1356
+ "SS": {
1357
+ "accuracy": 1.0,
1358
+ "count": 7
1359
+ },
1360
+ "UC": {
1361
+ "accuracy": 0.9887640449438202,
1362
+ "count": 89
1363
+ },
1364
+ "US": {
1365
+ "accuracy": 1.0,
1366
+ "count": 154
1367
+ }
1368
+ }
1369
+ },
1370
+ "add_S5": {
1371
+ "full_accuracy": 0.96,
1372
+ "n_examples": 50,
1373
+ "per_subtask": {
1374
+ "SA": {
1375
+ "accuracy": 1.0,
1376
+ "count": 50
1377
+ },
1378
+ "SC": {
1379
+ "accuracy": 1.0,
1380
+ "count": 50
1381
+ },
1382
+ "UC": {
1383
+ "accuracy": 0.96,
1384
+ "count": 50
1385
+ },
1386
+ "US": {
1387
+ "accuracy": 1.0,
1388
+ "count": 200
1389
+ }
1390
+ }
1391
+ },
1392
+ "add_S6": {
1393
+ "full_accuracy": 1.0,
1394
+ "n_examples": 50,
1395
+ "per_subtask": {
1396
+ "SC": {
1397
+ "accuracy": 1.0,
1398
+ "count": 50
1399
+ },
1400
+ "UC": {
1401
+ "accuracy": 1.0,
1402
+ "count": 50
1403
+ },
1404
+ "US": {
1405
+ "accuracy": 1.0,
1406
+ "count": 250
1407
+ }
1408
+ }
1409
+ },
1410
+ "add_random": {
1411
+ "full_accuracy": 1.0,
1412
+ "n_examples": 200,
1413
+ "per_subtask": {
1414
+ "SA": {
1415
+ "accuracy": 1.0,
1416
+ "count": 431
1417
+ },
1418
+ "SC": {
1419
+ "accuracy": 1.0,
1420
+ "count": 316
1421
+ },
1422
+ "SS": {
1423
+ "accuracy": 1.0,
1424
+ "count": 39
1425
+ },
1426
+ "UC": {
1427
+ "accuracy": 1.0,
1428
+ "count": 560
1429
+ },
1430
+ "US": {
1431
+ "accuracy": 1.0,
1432
+ "count": 54
1433
+ }
1434
+ }
1435
+ },
1436
+ "add_C3": {
1437
+ "full_accuracy": 1.0,
1438
+ "n_examples": 50,
1439
+ "per_subtask": {
1440
+ "SA": {
1441
+ "accuracy": 1.0,
1442
+ "count": 150
1443
+ },
1444
+ "SC": {
1445
+ "accuracy": 1.0,
1446
+ "count": 50
1447
+ },
1448
+ "UC": {
1449
+ "accuracy": 1.0,
1450
+ "count": 104
1451
+ },
1452
+ "US": {
1453
+ "accuracy": 1.0,
1454
+ "count": 46
1455
+ }
1456
+ }
1457
+ },
1458
+ "add_C4": {
1459
+ "full_accuracy": 0.96,
1460
+ "n_examples": 50,
1461
+ "per_subtask": {
1462
+ "SA": {
1463
+ "accuracy": 1.0,
1464
+ "count": 100
1465
+ },
1466
+ "SC": {
1467
+ "accuracy": 1.0,
1468
+ "count": 50
1469
+ },
1470
+ "UC": {
1471
+ "accuracy": 0.983739837398374,
1472
+ "count": 123
1473
+ },
1474
+ "US": {
1475
+ "accuracy": 1.0,
1476
+ "count": 77
1477
+ }
1478
+ }
1479
+ },
1480
+ "add_C5": {
1481
+ "full_accuracy": 1.0,
1482
+ "n_examples": 50,
1483
+ "per_subtask": {
1484
+ "SA": {
1485
+ "accuracy": 1.0,
1486
+ "count": 50
1487
+ },
1488
+ "SC": {
1489
+ "accuracy": 1.0,
1490
+ "count": 50
1491
+ },
1492
+ "UC": {
1493
+ "accuracy": 1.0,
1494
+ "count": 154
1495
+ },
1496
+ "US": {
1497
+ "accuracy": 1.0,
1498
+ "count": 96
1499
+ }
1500
+ }
1501
+ },
1502
+ "add_C6": {
1503
+ "full_accuracy": 1.0,
1504
+ "n_examples": 50,
1505
+ "per_subtask": {
1506
+ "SC": {
1507
+ "accuracy": 1.0,
1508
+ "count": 50
1509
+ },
1510
+ "UC": {
1511
+ "accuracy": 1.0,
1512
+ "count": 182
1513
+ },
1514
+ "US": {
1515
+ "accuracy": 1.0,
1516
+ "count": 118
1517
+ }
1518
+ }
1519
+ },
1520
+ "sub_M0": {
1521
+ "full_accuracy": 1.0,
1522
+ "n_examples": 50,
1523
+ "per_subtask": {
1524
+ "MD": {
1525
+ "accuracy": 1.0,
1526
+ "count": 294
1527
+ },
1528
+ "ME": {
1529
+ "accuracy": 1.0,
1530
+ "count": 56
1531
+ }
1532
+ }
1533
+ },
1534
+ "sub_M1": {
1535
+ "full_accuracy": 1.0,
1536
+ "n_examples": 50,
1537
+ "per_subtask": {
1538
+ "MD": {
1539
+ "accuracy": 1.0,
1540
+ "count": 143
1541
+ },
1542
+ "MB": {
1543
+ "accuracy": 1.0,
1544
+ "count": 69
1545
+ },
1546
+ "ME": {
1547
+ "accuracy": 1.0,
1548
+ "count": 15
1549
+ },
1550
+ "UB": {
1551
+ "accuracy": 1.0,
1552
+ "count": 123
1553
+ }
1554
+ }
1555
+ },
1556
+ "sub_M2": {
1557
+ "full_accuracy": 1.0,
1558
+ "n_examples": 50,
1559
+ "per_subtask": {
1560
+ "MD": {
1561
+ "accuracy": 1.0,
1562
+ "count": 108
1563
+ },
1564
+ "MB": {
1565
+ "accuracy": 1.0,
1566
+ "count": 52
1567
+ },
1568
+ "ME": {
1569
+ "accuracy": 1.0,
1570
+ "count": 52
1571
+ },
1572
+ "UB": {
1573
+ "accuracy": 1.0,
1574
+ "count": 87
1575
+ },
1576
+ "UD": {
1577
+ "accuracy": 1.0,
1578
+ "count": 51
1579
+ }
1580
+ }
1581
+ },
1582
+ "sub_M3": {
1583
+ "full_accuracy": 1.0,
1584
+ "n_examples": 50,
1585
+ "per_subtask": {
1586
+ "MD": {
1587
+ "accuracy": 1.0,
1588
+ "count": 94
1589
+ },
1590
+ "MB": {
1591
+ "accuracy": 1.0,
1592
+ "count": 51
1593
+ },
1594
+ "ME": {
1595
+ "accuracy": 1.0,
1596
+ "count": 25
1597
+ },
1598
+ "UB": {
1599
+ "accuracy": 1.0,
1600
+ "count": 78
1601
+ },
1602
+ "UD": {
1603
+ "accuracy": 1.0,
1604
+ "count": 102
1605
+ }
1606
+ }
1607
+ },
1608
+ "sub_M4": {
1609
+ "full_accuracy": 1.0,
1610
+ "n_examples": 50,
1611
+ "per_subtask": {
1612
+ "MD": {
1613
+ "accuracy": 1.0,
1614
+ "count": 100
1615
+ },
1616
+ "MB": {
1617
+ "accuracy": 1.0,
1618
+ "count": 50
1619
+ },
1620
+ "UB": {
1621
+ "accuracy": 1.0,
1622
+ "count": 50
1623
+ },
1624
+ "UD": {
1625
+ "accuracy": 1.0,
1626
+ "count": 150
1627
+ }
1628
+ }
1629
+ },
1630
+ "sub_M5": {
1631
+ "full_accuracy": 1.0,
1632
+ "n_examples": 50,
1633
+ "per_subtask": {
1634
+ "MD": {
1635
+ "accuracy": 1.0,
1636
+ "count": 50
1637
+ },
1638
+ "MB": {
1639
+ "accuracy": 1.0,
1640
+ "count": 50
1641
+ },
1642
+ "UB": {
1643
+ "accuracy": 1.0,
1644
+ "count": 50
1645
+ },
1646
+ "UD": {
1647
+ "accuracy": 1.0,
1648
+ "count": 200
1649
+ }
1650
+ }
1651
+ },
1652
+ "sub_random": {
1653
+ "full_accuracy": 1.0,
1654
+ "n_examples": 200,
1655
+ "per_subtask": {
1656
+ "MD": {
1657
+ "accuracy": 1.0,
1658
+ "count": 588
1659
+ },
1660
+ "MB": {
1661
+ "accuracy": 1.0,
1662
+ "count": 268
1663
+ },
1664
+ "ME": {
1665
+ "accuracy": 1.0,
1666
+ "count": 60
1667
+ },
1668
+ "UB": {
1669
+ "accuracy": 1.0,
1670
+ "count": 447
1671
+ },
1672
+ "UD": {
1673
+ "accuracy": 1.0,
1674
+ "count": 37
1675
+ }
1676
+ }
1677
+ },
1678
+ "sub_B3": {
1679
+ "full_accuracy": 1.0,
1680
+ "n_examples": 50,
1681
+ "per_subtask": {
1682
+ "MD": {
1683
+ "accuracy": 1.0,
1684
+ "count": 150
1685
+ },
1686
+ "MB": {
1687
+ "accuracy": 1.0,
1688
+ "count": 50
1689
+ },
1690
+ "UB": {
1691
+ "accuracy": 1.0,
1692
+ "count": 107
1693
+ },
1694
+ "UD": {
1695
+ "accuracy": 1.0,
1696
+ "count": 43
1697
+ }
1698
+ }
1699
+ },
1700
+ "sub_B4": {
1701
+ "full_accuracy": 1.0,
1702
+ "n_examples": 50,
1703
+ "per_subtask": {
1704
+ "MD": {
1705
+ "accuracy": 1.0,
1706
+ "count": 100
1707
+ },
1708
+ "MB": {
1709
+ "accuracy": 1.0,
1710
+ "count": 50
1711
+ },
1712
+ "UB": {
1713
+ "accuracy": 1.0,
1714
+ "count": 114
1715
+ },
1716
+ "UD": {
1717
+ "accuracy": 1.0,
1718
+ "count": 86
1719
+ }
1720
+ }
1721
+ },
1722
+ "sub_B5": {
1723
+ "full_accuracy": 1.0,
1724
+ "n_examples": 50,
1725
+ "per_subtask": {
1726
+ "MD": {
1727
+ "accuracy": 1.0,
1728
+ "count": 50
1729
+ },
1730
+ "MB": {
1731
+ "accuracy": 1.0,
1732
+ "count": 50
1733
+ },
1734
+ "UB": {
1735
+ "accuracy": 1.0,
1736
+ "count": 153
1737
+ },
1738
+ "UD": {
1739
+ "accuracy": 1.0,
1740
+ "count": 97
1741
+ }
1742
+ }
1743
+ }
1744
+ },
1745
+ "summary": {
1746
+ "overall_accuracy": 0.9964285714285714,
1747
+ "total_examples": 1400,
1748
+ "n_splits": 22
1749
+ }
1750
+ },
1751
+ "sorl_eval": {
1752
+ "config": {
1753
+ "ops": "add_sub",
1754
+ "K": 1,
1755
+ "mode": "sorl",
1756
+ "n_digits": 6,
1757
+ "n_per_split": 50
1758
+ },
1759
+ "splits": {
1760
+ "add_S0": {
1761
+ "full_accuracy": 1.0,
1762
+ "n_examples": 50,
1763
+ "per_subtask": {
1764
+ "SA": {
1765
+ "accuracy": 1.0,
1766
+ "count": 295
1767
+ },
1768
+ "SS": {
1769
+ "accuracy": 1.0,
1770
+ "count": 55
1771
+ }
1772
+ }
1773
+ },
1774
+ "add_S1": {
1775
+ "full_accuracy": 1.0,
1776
+ "n_examples": 50,
1777
+ "per_subtask": {
1778
+ "SA": {
1779
+ "accuracy": 1.0,
1780
+ "count": 126
1781
+ },
1782
+ "SC": {
1783
+ "accuracy": 1.0,
1784
+ "count": 79
1785
+ },
1786
+ "SS": {
1787
+ "accuracy": 1.0,
1788
+ "count": 21
1789
+ },
1790
+ "UC": {
1791
+ "accuracy": 1.0,
1792
+ "count": 124
1793
+ }
1794
+ }
1795
+ },
1796
+ "add_S2": {
1797
+ "full_accuracy": 1.0,
1798
+ "n_examples": 50,
1799
+ "per_subtask": {
1800
+ "SA": {
1801
+ "accuracy": 1.0,
1802
+ "count": 75
1803
+ },
1804
+ "SC": {
1805
+ "accuracy": 1.0,
1806
+ "count": 62
1807
+ },
1808
+ "SS": {
1809
+ "accuracy": 1.0,
1810
+ "count": 39
1811
+ },
1812
+ "UC": {
1813
+ "accuracy": 1.0,
1814
+ "count": 111
1815
+ },
1816
+ "US": {
1817
+ "accuracy": 1.0,
1818
+ "count": 63
1819
+ }
1820
+ }
1821
+ },
1822
+ "add_S3": {
1823
+ "full_accuracy": 1.0,
1824
+ "n_examples": 50,
1825
+ "per_subtask": {
1826
+ "SA": {
1827
+ "accuracy": 1.0,
1828
+ "count": 60
1829
+ },
1830
+ "SC": {
1831
+ "accuracy": 1.0,
1832
+ "count": 57
1833
+ },
1834
+ "SS": {
1835
+ "accuracy": 1.0,
1836
+ "count": 19
1837
+ },
1838
+ "UC": {
1839
+ "accuracy": 1.0,
1840
+ "count": 104
1841
+ },
1842
+ "US": {
1843
+ "accuracy": 1.0,
1844
+ "count": 110
1845
+ }
1846
+ }
1847
+ },
1848
+ "add_S4": {
1849
+ "full_accuracy": 1.0,
1850
+ "n_examples": 50,
1851
+ "per_subtask": {
1852
+ "SA": {
1853
+ "accuracy": 1.0,
1854
+ "count": 48
1855
+ },
1856
+ "SC": {
1857
+ "accuracy": 1.0,
1858
+ "count": 52
1859
+ },
1860
+ "SS": {
1861
+ "accuracy": 1.0,
1862
+ "count": 7
1863
+ },
1864
+ "UC": {
1865
+ "accuracy": 1.0,
1866
+ "count": 89
1867
+ },
1868
+ "US": {
1869
+ "accuracy": 1.0,
1870
+ "count": 154
1871
+ }
1872
+ }
1873
+ },
1874
+ "add_S5": {
1875
+ "full_accuracy": 1.0,
1876
+ "n_examples": 50,
1877
+ "per_subtask": {
1878
+ "SA": {
1879
+ "accuracy": 1.0,
1880
+ "count": 50
1881
+ },
1882
+ "SC": {
1883
+ "accuracy": 1.0,
1884
+ "count": 50
1885
+ },
1886
+ "UC": {
1887
+ "accuracy": 1.0,
1888
+ "count": 50
1889
+ },
1890
+ "US": {
1891
+ "accuracy": 1.0,
1892
+ "count": 200
1893
+ }
1894
+ }
1895
+ },
1896
+ "add_S6": {
1897
+ "full_accuracy": 1.0,
1898
+ "n_examples": 50,
1899
+ "per_subtask": {
1900
+ "SC": {
1901
+ "accuracy": 1.0,
1902
+ "count": 50
1903
+ },
1904
+ "UC": {
1905
+ "accuracy": 1.0,
1906
+ "count": 50
1907
+ },
1908
+ "US": {
1909
+ "accuracy": 1.0,
1910
+ "count": 250
1911
+ }
1912
+ }
1913
+ },
1914
+ "add_random": {
1915
+ "full_accuracy": 1.0,
1916
+ "n_examples": 200,
1917
+ "per_subtask": {
1918
+ "SA": {
1919
+ "accuracy": 1.0,
1920
+ "count": 431
1921
+ },
1922
+ "SC": {
1923
+ "accuracy": 1.0,
1924
+ "count": 316
1925
+ },
1926
+ "SS": {
1927
+ "accuracy": 1.0,
1928
+ "count": 39
1929
+ },
1930
+ "UC": {
1931
+ "accuracy": 1.0,
1932
+ "count": 560
1933
+ },
1934
+ "US": {
1935
+ "accuracy": 1.0,
1936
+ "count": 54
1937
+ }
1938
+ }
1939
+ },
1940
+ "add_C3": {
1941
+ "full_accuracy": 1.0,
1942
+ "n_examples": 50,
1943
+ "per_subtask": {
1944
+ "SA": {
1945
+ "accuracy": 1.0,
1946
+ "count": 150
1947
+ },
1948
+ "SC": {
1949
+ "accuracy": 1.0,
1950
+ "count": 50
1951
+ },
1952
+ "UC": {
1953
+ "accuracy": 1.0,
1954
+ "count": 104
1955
+ },
1956
+ "US": {
1957
+ "accuracy": 1.0,
1958
+ "count": 46
1959
+ }
1960
+ }
1961
+ },
1962
+ "add_C4": {
1963
+ "full_accuracy": 1.0,
1964
+ "n_examples": 50,
1965
+ "per_subtask": {
1966
+ "SA": {
1967
+ "accuracy": 1.0,
1968
+ "count": 100
1969
+ },
1970
+ "SC": {
1971
+ "accuracy": 1.0,
1972
+ "count": 50
1973
+ },
1974
+ "UC": {
1975
+ "accuracy": 1.0,
1976
+ "count": 123
1977
+ },
1978
+ "US": {
1979
+ "accuracy": 1.0,
1980
+ "count": 77
1981
+ }
1982
+ }
1983
+ },
1984
+ "add_C5": {
1985
+ "full_accuracy": 1.0,
1986
+ "n_examples": 50,
1987
+ "per_subtask": {
1988
+ "SA": {
1989
+ "accuracy": 1.0,
1990
+ "count": 50
1991
+ },
1992
+ "SC": {
1993
+ "accuracy": 1.0,
1994
+ "count": 50
1995
+ },
1996
+ "UC": {
1997
+ "accuracy": 1.0,
1998
+ "count": 154
1999
+ },
2000
+ "US": {
2001
+ "accuracy": 1.0,
2002
+ "count": 96
2003
+ }
2004
+ }
2005
+ },
2006
+ "add_C6": {
2007
+ "full_accuracy": 1.0,
2008
+ "n_examples": 50,
2009
+ "per_subtask": {
2010
+ "SC": {
2011
+ "accuracy": 1.0,
2012
+ "count": 50
2013
+ },
2014
+ "UC": {
2015
+ "accuracy": 1.0,
2016
+ "count": 182
2017
+ },
2018
+ "US": {
2019
+ "accuracy": 1.0,
2020
+ "count": 118
2021
+ }
2022
+ }
2023
+ },
2024
+ "sub_M0": {
2025
+ "full_accuracy": 1.0,
2026
+ "n_examples": 50,
2027
+ "per_subtask": {
2028
+ "MD": {
2029
+ "accuracy": 1.0,
2030
+ "count": 294
2031
+ },
2032
+ "ME": {
2033
+ "accuracy": 1.0,
2034
+ "count": 56
2035
+ }
2036
+ }
2037
+ },
2038
+ "sub_M1": {
2039
+ "full_accuracy": 1.0,
2040
+ "n_examples": 50,
2041
+ "per_subtask": {
2042
+ "MD": {
2043
+ "accuracy": 1.0,
2044
+ "count": 143
2045
+ },
2046
+ "MB": {
2047
+ "accuracy": 1.0,
2048
+ "count": 69
2049
+ },
2050
+ "ME": {
2051
+ "accuracy": 1.0,
2052
+ "count": 15
2053
+ },
2054
+ "UB": {
2055
+ "accuracy": 1.0,
2056
+ "count": 123
2057
+ }
2058
+ }
2059
+ },
2060
+ "sub_M2": {
2061
+ "full_accuracy": 1.0,
2062
+ "n_examples": 50,
2063
+ "per_subtask": {
2064
+ "MD": {
2065
+ "accuracy": 1.0,
2066
+ "count": 108
2067
+ },
2068
+ "MB": {
2069
+ "accuracy": 1.0,
2070
+ "count": 52
2071
+ },
2072
+ "ME": {
2073
+ "accuracy": 1.0,
2074
+ "count": 52
2075
+ },
2076
+ "UB": {
2077
+ "accuracy": 1.0,
2078
+ "count": 87
2079
+ },
2080
+ "UD": {
2081
+ "accuracy": 1.0,
2082
+ "count": 51
2083
+ }
2084
+ }
2085
+ },
2086
+ "sub_M3": {
2087
+ "full_accuracy": 1.0,
2088
+ "n_examples": 50,
2089
+ "per_subtask": {
2090
+ "MD": {
2091
+ "accuracy": 1.0,
2092
+ "count": 94
2093
+ },
2094
+ "MB": {
2095
+ "accuracy": 1.0,
2096
+ "count": 51
2097
+ },
2098
+ "ME": {
2099
+ "accuracy": 1.0,
2100
+ "count": 25
2101
+ },
2102
+ "UB": {
2103
+ "accuracy": 1.0,
2104
+ "count": 78
2105
+ },
2106
+ "UD": {
2107
+ "accuracy": 1.0,
2108
+ "count": 102
2109
+ }
2110
+ }
2111
+ },
2112
+ "sub_M4": {
2113
+ "full_accuracy": 1.0,
2114
+ "n_examples": 50,
2115
+ "per_subtask": {
2116
+ "MD": {
2117
+ "accuracy": 1.0,
2118
+ "count": 100
2119
+ },
2120
+ "MB": {
2121
+ "accuracy": 1.0,
2122
+ "count": 50
2123
+ },
2124
+ "UB": {
2125
+ "accuracy": 1.0,
2126
+ "count": 50
2127
+ },
2128
+ "UD": {
2129
+ "accuracy": 1.0,
2130
+ "count": 150
2131
+ }
2132
+ }
2133
+ },
2134
+ "sub_M5": {
2135
+ "full_accuracy": 1.0,
2136
+ "n_examples": 50,
2137
+ "per_subtask": {
2138
+ "MD": {
2139
+ "accuracy": 1.0,
2140
+ "count": 50
2141
+ },
2142
+ "MB": {
2143
+ "accuracy": 1.0,
2144
+ "count": 50
2145
+ },
2146
+ "UB": {
2147
+ "accuracy": 1.0,
2148
+ "count": 50
2149
+ },
2150
+ "UD": {
2151
+ "accuracy": 1.0,
2152
+ "count": 200
2153
+ }
2154
+ }
2155
+ },
2156
+ "sub_random": {
2157
+ "full_accuracy": 1.0,
2158
+ "n_examples": 200,
2159
+ "per_subtask": {
2160
+ "MD": {
2161
+ "accuracy": 1.0,
2162
+ "count": 588
2163
+ },
2164
+ "MB": {
2165
+ "accuracy": 1.0,
2166
+ "count": 268
2167
+ },
2168
+ "ME": {
2169
+ "accuracy": 1.0,
2170
+ "count": 60
2171
+ },
2172
+ "UB": {
2173
+ "accuracy": 1.0,
2174
+ "count": 447
2175
+ },
2176
+ "UD": {
2177
+ "accuracy": 1.0,
2178
+ "count": 37
2179
+ }
2180
+ }
2181
+ },
2182
+ "sub_B3": {
2183
+ "full_accuracy": 1.0,
2184
+ "n_examples": 50,
2185
+ "per_subtask": {
2186
+ "MD": {
2187
+ "accuracy": 1.0,
2188
+ "count": 150
2189
+ },
2190
+ "MB": {
2191
+ "accuracy": 1.0,
2192
+ "count": 50
2193
+ },
2194
+ "UB": {
2195
+ "accuracy": 1.0,
2196
+ "count": 107
2197
+ },
2198
+ "UD": {
2199
+ "accuracy": 1.0,
2200
+ "count": 43
2201
+ }
2202
+ }
2203
+ },
2204
+ "sub_B4": {
2205
+ "full_accuracy": 1.0,
2206
+ "n_examples": 50,
2207
+ "per_subtask": {
2208
+ "MD": {
2209
+ "accuracy": 1.0,
2210
+ "count": 100
2211
+ },
2212
+ "MB": {
2213
+ "accuracy": 1.0,
2214
+ "count": 50
2215
+ },
2216
+ "UB": {
2217
+ "accuracy": 1.0,
2218
+ "count": 114
2219
+ },
2220
+ "UD": {
2221
+ "accuracy": 1.0,
2222
+ "count": 86
2223
+ }
2224
+ }
2225
+ },
2226
+ "sub_B5": {
2227
+ "full_accuracy": 1.0,
2228
+ "n_examples": 50,
2229
+ "per_subtask": {
2230
+ "MD": {
2231
+ "accuracy": 1.0,
2232
+ "count": 50
2233
+ },
2234
+ "MB": {
2235
+ "accuracy": 1.0,
2236
+ "count": 50
2237
+ },
2238
+ "UB": {
2239
+ "accuracy": 1.0,
2240
+ "count": 153
2241
+ },
2242
+ "UD": {
2243
+ "accuracy": 1.0,
2244
+ "count": 97
2245
+ }
2246
+ }
2247
+ }
2248
+ },
2249
+ "summary": {
2250
+ "overall_accuracy": 1.0,
2251
+ "total_examples": 1400,
2252
+ "n_splits": 22
2253
+ }
2254
+ },
2255
+ "sorl_overall_accuracy": 1.0,
2256
+ "sft_overall_accuracy": 0.9964285714285714
2257
+ }
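Reviewer note: the `summary.overall_accuracy` values in `metrics.json` can be cross-checked against the per-split `full_accuracy` and `n_examples` fields. The sketch below is a minimal check, not the project's `ArithmeticEvaluator`. The helper names `build_splits` and `overall_accuracy` are hypothetical, and it assumes the summary is an example-weighted mean over the 22 splits. The only SFT splits below 1.0 in this file are `add_S4` (0.98), `add_S5` (0.96), and `add_C4` (0.96).

```python
# Hypothetical helpers for cross-checking metrics.json; not part of the repo.

SPLIT_NAMES = (
    [f"add_S{i}" for i in range(7)] + ["add_random"]
    + [f"add_C{i}" for i in range(3, 7)]
    + [f"sub_M{i}" for i in range(6)] + ["sub_random"]
    + [f"sub_B{i}" for i in range(3, 6)]
)


def build_splits(imperfect):
    """Rebuild {split: {full_accuracy, n_examples}} from the values in the diff.

    `imperfect` maps split name -> full_accuracy for splits below 1.0.
    The two *_random splits have 200 examples; every other split has 50.
    """
    splits = {}
    for name in SPLIT_NAMES:
        n = 200 if name.endswith("_random") else 50
        splits[name] = {"full_accuracy": imperfect.get(name, 1.0), "n_examples": n}
    return splits


def overall_accuracy(splits):
    """Example-weighted accuracy (assumed to be how the summary is computed)."""
    total = sum(s["n_examples"] for s in splits.values())
    correct = sum(round(s["full_accuracy"] * s["n_examples"]) for s in splits.values())
    return correct / total, total


sft = build_splits({"add_S4": 0.98, "add_S5": 0.96, "add_C4": 0.96})
sorl = build_splits({})

sft_acc, sft_total = overall_accuracy(sft)
sorl_acc, sorl_total = overall_accuracy(sorl)
print(len(SPLIT_NAMES), sft_total, sft_acc, sorl_acc)
```

Under that assumption, the SFT run gets 5 of 1400 examples wrong, which agrees with the reported `sft_overall_accuracy`, and the SORL run gets none wrong.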
add_sub_sorl_v1_abs10_K1_50K/model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4dcf90a20c26c5b68f843a61059999485f735079999d942153afd32e303fdb71
3
+ size 650303660
add_sub_sorl_v1_abs10_K1_50K/train_config.json ADDED
@@ -0,0 +1,35 @@
1
+ {
2
+ "mode": "sorl",
3
+ "ops": "add_sub",
4
+ "n_digits": 6,
5
+ "n_layer": 2,
6
+ "n_head": 3,
7
+ "n_embd": 510,
8
+ "abs_vocab": 10,
9
+ "K": 1,
10
+ "alpha_info_gain": 10.0,
11
+ "alpha_abs": 0.1,
12
+ "alpha_soft_zipf": 1.0,
13
+ "batch_size": 64,
14
+ "num_epochs": 10,
15
+ "dataset_size": 50000,
16
+ "lr": 8e-05,
17
+ "output_dir": "ckpt/sweep/as_sorl_abs10_K1_50K",
18
+ "device": "cuda",
19
+ "push_to_hub": true,
20
+ "no_wandb": false,
21
+ "n_params": 162499262,
22
+ "run_name": "add_sub_sorl_v1_abs10_K1_50K",
23
+ "git_commit": "800625019270114adcda289bbd550c4f1109a514",
24
+ "timestamp": "2026-04-12T01:58:11.881229+00:00",
25
+ "tokenizer": "Qwen/Qwen3-0.6B",
26
+ "dataset_repo": "thoughtworks/arithmetic-sorl-data",
27
+ "dataset_config": "add_sub_6digit",
28
+ "model_repo": "thoughtworks/arithmetic-sorl",
29
+ "trainer_version": "v1",
30
+ "wandb_run_id": "yb0fucvp",
31
+ "wandb_url": "https://wandb.ai/nlp_and_interpretability/sorl-arithmetic/runs/yb0fucvp",
32
+ "final_accuracy": 1.0,
33
+ "sft_accuracy": 0.9964285714285714,
34
+ "eval_method": "ArithmeticEvaluator"
35
+ }
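Reviewer note: the `n_params` value in `train_config.json` can be sanity-checked against the architecture fields. The sketch below assumes a standard Qwen3-style decoder layout: untied input and output embeddings, bias-free q/k/v/o projections, per-head q/k RMSNorm, a three-matrix gated MLP, and two RMSNorms per layer plus a final norm. The function name `qwen3_param_count` is hypothetical. `head_dim`, `intermediate_size`, `vocab_size`, and `num_key_value_heads` are taken from the `config.json` added in this same commit; the remaining values come from `train_config.json`.

```python
# Hypothetical helper; assumes a standard Qwen3-style decoder parameter layout.

def qwen3_param_count(vocab_size, hidden_size, intermediate_size,
                      num_layers, num_heads, num_kv_heads, head_dim,
                      tie_word_embeddings=False):
    # Token embedding, plus a separate LM head when embeddings are untied.
    embed = vocab_size * hidden_size
    lm_head = 0 if tie_word_embeddings else vocab_size * hidden_size

    # Attention projections without bias (attention_bias is false in config.json).
    q_proj = hidden_size * num_heads * head_dim
    kv_proj = 2 * hidden_size * num_kv_heads * head_dim
    o_proj = num_heads * head_dim * hidden_size
    qk_norm = 2 * head_dim  # RMSNorm weights applied per head to q and k

    # Gated MLP: gate, up, and down projections.
    mlp = 3 * hidden_size * intermediate_size

    # Input and post-attention RMSNorm weights.
    layer_norms = 2 * hidden_size

    per_layer = q_proj + kv_proj + o_proj + qk_norm + mlp + layer_norms
    final_norm = hidden_size
    return embed + lm_head + num_layers * per_layer + final_norm


n_params = qwen3_param_count(
    vocab_size=151654,       # config.json
    hidden_size=510,         # n_embd
    intermediate_size=2040,  # config.json
    num_layers=2,            # n_layer
    num_heads=3,             # n_head
    num_kv_heads=3,          # config.json
    head_dim=128,            # config.json
)
print(n_params)
```

Under these assumptions the count equals the reported `n_params` of 162499262. Roughly 95% of that is the two 151654 x 510 embedding matrices, so the two transformer layers themselves account for only about 7.8M parameters.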