iamshlomo committed on
Commit
2433c6f
·
verified ·
1 Parent(s): 0e027fa

Upload generalization/full_analysis_20260324_1808/minihack_full_results.json with huggingface_hub

generalization/full_analysis_20260324_1808/minihack_full_results.json ADDED
@@ -0,0 +1,2341 @@
+ {
+ "run_id": "20260324_1808",
+ "max_iter": 500,
+ "eval_eps": 10,
+ "id_envs": [
+ "MiniHack-Room-Random-5x5-v0",
+ "MiniHack-Room-Random-15x15-v0",
+ "MiniHack-Corridor-R2-v0",
+ "MiniHack-MazeWalk-9x9-v0"
+ ],
+ "ood_envs": [
+ "MiniHack-Room-Dark-15x15-v0",
+ "MiniHack-Corridor-R5-v0",
+ "MiniHack-MazeWalk-45x19-v0"
+ ],
+ "baseline_id": 0.525,
+ "baseline_ood": 0.09999999999999999,
+ "baseline_rl_id": 0.07500000000000001,
+ "baseline_rl_ood": 0.03333333333333333,
+ "kl_penalty": {
+ "id": 0.19999999999999998,
+ "ood": 0.06666666666666667
+ },
+ "frozen_backbone": {
+ "id": 0.07500000000000001,
+ "ood": 0.0
+ },
+ "bc_on_wins": {
+ "id": 0.0,
+ "ood": 0.0
+ },
+ "low_t_only": {
+ "id": 0.1,
+ "ood": 0.03333333333333333
+ },
+ "synthetic": {
+ "id": 0.05
+ },
+ "histories": {
40
+ "baseline_rl": {
41
+ "iter": [
42
+ 10,
43
+ 20,
44
+ 30,
45
+ 40,
46
+ 50,
47
+ 60,
48
+ 70,
49
+ 80,
50
+ 90,
51
+ 100,
52
+ 110,
53
+ 120,
54
+ 130,
55
+ 140,
56
+ 150,
57
+ 160,
58
+ 170,
59
+ 180,
60
+ 190,
61
+ 200,
62
+ 210,
63
+ 220,
64
+ 230,
65
+ 240,
66
+ 250,
67
+ 260,
68
+ 270,
69
+ 280,
70
+ 290,
71
+ 300,
72
+ 310,
73
+ 320,
74
+ 330,
75
+ 340,
76
+ 350,
77
+ 360,
78
+ 370,
79
+ 380,
80
+ 390,
81
+ 400,
82
+ 410,
83
+ 420,
84
+ 430,
85
+ 440,
86
+ 450,
87
+ 460,
88
+ 470,
89
+ 480,
90
+ 490,
91
+ 500
92
+ ],
93
+ "loss": [
94
+ 0.6148233324289322,
95
+ 0.3809540167450905,
96
+ 0.2913197621703148,
97
+ 0.26417313516139984,
98
+ 0.11857011616230011,
99
+ 0.0840552069246769,
100
+ 0.08531304597854614,
101
+ 0.1835819397121668,
102
+ 0.20300053358078002,
103
+ 0.12927267588675023,
104
+ 0.11205883212387562,
105
+ 0.17100902125239373,
106
+ 0.20254286825656892,
107
+ 0.107823496311903,
108
+ 0.07153078094124794,
109
+ 0.1800502438098192,
110
+ 0.1361511755734682,
111
+ 0.09857502207159996,
112
+ 0.10814670212566853,
113
+ 0.13385814875364305,
114
+ 0.14108883105218412,
115
+ 0.12549171056598424,
116
+ 0.1697872318327427,
117
+ 0.07361965905874968,
118
+ 0.11370712220668792,
119
+ 0.18413102626800537,
120
+ 0.1451867740601301,
121
+ 0.2053861353546381,
122
+ 0.12562402486801147,
123
+ 0.09759063683450223,
124
+ 0.25905307196080685,
125
+ 0.09509452991187572,
126
+ 0.10611449293792248,
127
+ 0.12874542437493802,
128
+ 0.11595193743705749,
129
+ 0.28429544251412153,
130
+ 0.06217188648879528,
131
+ 0.11267267782241105,
132
+ 0.1121125804260373,
133
+ 0.3091689977794886,
134
+ 0.19386135265231133,
135
+ 0.18517458215355873,
136
+ 0.059556832257658246,
137
+ 0.17544590644538402,
138
+ 0.15815371368080378,
139
+ 0.07657972946763039,
140
+ 0.08935816995799542,
141
+ 0.3414760454557836,
142
+ 0.060135122109204533,
143
+ 0.09533649031072855
144
+ ],
145
+ "mean_return": [
146
+ 9.893000000000004,
147
+ 1.388000000000008,
148
+ -3.989999999999989,
149
+ 1.1050000000000064,
150
+ 4.343000000000001,
151
+ 0.9850000000000115,
152
+ -0.4339999999999934,
153
+ 3.557000000000005,
154
+ 0.7410000000000121,
155
+ -1.7819999999999883,
156
+ -2.323999999999984,
157
+ 1.8760000000000054,
158
+ -3.9299999999999864,
159
+ -2.0499999999999856,
160
+ -4.01099999999999,
161
+ -2.780999999999999,
162
+ -3.144999999999993,
163
+ -4.459999999999985,
164
+ -1.572999999999987,
165
+ -4.6479999999999855,
166
+ -2.8879999999999897,
167
+ 1.8810000000000096,
168
+ -2.045999999999987,
169
+ -2.979999999999994,
170
+ 2.3930000000000016,
171
+ 1.2220000000000069,
172
+ -4.036999999999989,
173
+ -3.680999999999988,
174
+ -0.7959999999999952,
175
+ -3.362999999999991,
176
+ 0.8120000000000054,
177
+ -4.1609999999999925,
178
+ -2.8989999999999982,
179
+ -0.6399999999999968,
180
+ -4.019999999999989,
181
+ 0.17200000000000076,
182
+ -3.0829999999999975,
183
+ -0.9379999999999971,
184
+ 0.14200000000000973,
185
+ -1.6859999999999886,
186
+ -3.205999999999995,
187
+ -3.496999999999994,
188
+ 0.3280000000000037,
189
+ 0.3240000000000072,
190
+ 3.0590000000000095,
191
+ -3.0499999999999914,
192
+ -0.6909999999999927,
193
+ -3.3699999999999988,
194
+ -0.16999999999999393,
195
+ -3.383999999999996
196
+ ],
197
+ "win_rate_episode": [
198
+ 0.5,
199
+ 0.2,
200
+ 0.0,
201
+ 0.2,
202
+ 0.3,
203
+ 0.2,
204
+ 0.1,
205
+ 0.3,
206
+ 0.2,
207
+ 0.1,
208
+ 0.1,
209
+ 0.2,
210
+ 0.0,
211
+ 0.1,
212
+ 0.0,
213
+ 0.0,
214
+ 0.0,
215
+ 0.0,
216
+ 0.1,
217
+ 0.0,
218
+ 0.0,
219
+ 0.2,
220
+ 0.0,
221
+ 0.0,
222
+ 0.2,
223
+ 0.2,
224
+ 0.0,
225
+ 0.0,
226
+ 0.1,
227
+ 0.0,
228
+ 0.2,
229
+ 0.0,
230
+ 0.0,
231
+ 0.1,
232
+ 0.0,
233
+ 0.1,
234
+ 0.0,
235
+ 0.1,
236
+ 0.1,
237
+ 0.1,
238
+ 0.0,
239
+ 0.0,
240
+ 0.1,
241
+ 0.1,
242
+ 0.3,
243
+ 0.0,
244
+ 0.1,
245
+ 0.0,
246
+ 0.1,
247
+ 0.0
248
+ ],
249
+ "id_winrate": [
250
+ 0.07500000000000001,
251
+ 0.025,
252
+ 0.15000000000000002,
253
+ 0.025,
254
+ 0.05
255
+ ],
256
+ "id_winrate_iter": [
257
+ 100,
258
+ 200,
259
+ 300,
260
+ 400,
261
+ 500
262
+ ],
263
+ "ood_winrate": [
264
+ 0.0
265
+ ],
266
+ "ood_winrate_iter": [
267
+ 500
268
+ ],
269
+ "grad_align_iter": [
270
+ 25,
271
+ 50,
272
+ 75,
273
+ 100,
274
+ 125,
275
+ 150,
276
+ 175,
277
+ 200,
278
+ 225,
279
+ 250,
280
+ 275,
281
+ 300,
282
+ 325,
283
+ 350,
284
+ 375,
285
+ 400,
286
+ 425,
287
+ 450,
288
+ 475,
289
+ 500
290
+ ],
291
+ "grad_align": [
292
+ -0.058208562433719635,
293
+ -0.0651715025305748,
294
+ 0.02432907372713089,
295
+ -0.11967084556818008,
296
+ -0.06396359205245972,
297
+ 0.05138474702835083,
298
+ 0.06564196199178696,
299
+ -0.056323569267988205,
300
+ -0.12439217418432236,
301
+ 0.06368804723024368,
302
+ 0.03813093900680542,
303
+ -0.0025425786152482033,
304
+ 0.07177173346281052,
305
+ 0.07621006667613983,
306
+ 0.08873365074396133,
307
+ 0.06467217952013016,
308
+ 0.10458952188491821,
309
+ -0.0745132565498352,
310
+ -0.043301280587911606,
311
+ -0.054070524871349335
312
+ ],
313
+ "rl_grad_norm": [
314
+ 5.1438727378845215,
315
+ 1.8353410959243774,
316
+ 2.5503463745117188,
317
+ 3.223416805267334,
318
+ 4.39517068862915,
319
+ 1.2743784189224243,
320
+ 4.085864543914795,
321
+ 5.110538959503174,
322
+ 0.8212806582450867,
323
+ 0.5378777980804443,
324
+ 4.511179447174072,
325
+ 3.9516587257385254,
326
+ 0.2867647707462311,
327
+ 0.440229207277298,
328
+ 4.9942240715026855,
329
+ 0.40748071670532227,
330
+ 0.30211448669433594,
331
+ 0.29640454053878784,
332
+ 4.158074378967285,
333
+ 0.4770386517047882
334
+ ],
335
+ "bc_grad_norm": [
336
+ 42.94075393676758,
337
+ 36.67488479614258,
338
+ 23.443866729736328,
339
+ 31.622037887573242,
340
+ 31.358760833740234,
341
+ 29.620054244995117,
342
+ 33.03789520263672,
343
+ 30.791826248168945,
344
+ 30.711111068725586,
345
+ 31.014888763427734,
346
+ 28.969528198242188,
347
+ 36.763179779052734,
348
+ 40.4757080078125,
349
+ 27.80027198791504,
350
+ 35.19121551513672,
351
+ 23.520423889160156,
352
+ 31.927614212036133,
353
+ 28.92644500732422,
354
+ 25.52045249938965,
355
+ 29.165616989135742
356
+ ],
357
+ "repr_drift_iter": [
358
+ 25,
359
+ 50,
360
+ 75,
361
+ 100,
362
+ 125,
363
+ 150,
364
+ 175,
365
+ 200,
366
+ 225,
367
+ 250,
368
+ 275,
369
+ 300,
370
+ 325,
371
+ 350,
372
+ 375,
373
+ 400,
374
+ 425,
375
+ 450,
376
+ 475,
377
+ 500
378
+ ],
379
+ "repr_drift": [
380
+ 3.9174275398254395,
381
+ 3.921295404434204,
382
+ 3.289928913116455,
383
+ 3.432715892791748,
384
+ 4.955285549163818,
385
+ 6.035404682159424,
386
+ 3.7579100131988525,
387
+ 6.952962875366211,
388
+ 4.759391784667969,
389
+ 5.946002960205078,
390
+ 5.042165279388428,
391
+ 5.096562385559082,
392
+ 7.713111877441406,
393
+ 5.057335376739502,
394
+ 5.915785312652588,
395
+ 6.065906524658203,
396
+ 6.096728324890137,
397
+ 5.633598804473877,
398
+ 7.825881004333496,
399
+ 5.941793441772461
400
+ ],
401
+ "t_analysis_iter": [
402
+ 25,
403
+ 50,
404
+ 75,
405
+ 100,
406
+ 125,
407
+ 150,
408
+ 175,
409
+ 200,
410
+ 225,
411
+ 250,
412
+ 275,
413
+ 300,
414
+ 325,
415
+ 350,
416
+ 375,
417
+ 400,
418
+ 425,
419
+ 450,
420
+ 475,
421
+ 500
422
+ ],
423
+ "norm_low_t": [
424
+ 0.6042988896369934,
425
+ 1.0757032632827759,
426
+ 0.7331315279006958,
427
+ 0.24747660756111145,
428
+ 2.080561637878418,
429
+ 0.9171179533004761,
430
+ 1.712111234664917,
431
+ 0.29089364409446716,
432
+ 1.380139946937561,
433
+ 0.2912912964820862,
434
+ 9.737605094909668,
435
+ 1.1511173248291016,
436
+ 0.18356630206108093,
437
+ 0.20324313640594482,
438
+ 0.17540384829044342,
439
+ 0.42038843035697937,
440
+ 0.17512419819831848,
441
+ 0.2651197910308838,
442
+ 1.116037368774414,
443
+ 0.371931254863739
444
+ ],
445
+ "norm_high_t": [
446
+ 4.477694511413574,
447
+ 1.4458259344100952,
448
+ 1.6219706535339355,
449
+ 2.7416775226593018,
450
+ 8.002521514892578,
451
+ 0.4847201108932495,
452
+ 2.993985891342163,
453
+ 5.996467590332031,
454
+ 1.0836647748947144,
455
+ 2.2951300144195557,
456
+ 3.4846208095550537,
457
+ 2.6846556663513184,
458
+ 3.8576018810272217,
459
+ 3.7125260829925537,
460
+ 8.503217697143555,
461
+ 0.315531849861145,
462
+ 0.9193480014801025,
463
+ 2.7490694522857666,
464
+ 3.2481539249420166,
465
+ 0.9018426537513733
466
+ ],
467
+ "lowhigh_cos": [
468
+ -0.03117767721414566,
469
+ 0.3306260108947754,
470
+ -0.01929948851466179,
471
+ -0.03247307986021042,
472
+ 0.31372562050819397,
473
+ 0.538202702999115,
474
+ 0.4733111262321472,
475
+ 0.05152900516986847,
476
+ 0.03016793541610241,
477
+ 0.27196091413497925,
478
+ -0.13637898862361908,
479
+ 0.352779746055603,
480
+ 0.3681124448776245,
481
+ -0.21593783795833588,
482
+ 0.14504966139793396,
483
+ 0.08981875330209732,
484
+ -0.0734558179974556,
485
+ -0.16903267800807953,
486
+ -0.04215971753001213,
487
+ 0.3218880891799927
488
+ ]
489
+ },
490
+ "kl_penalty": {
491
+ "iter": [
492
+ 10,
493
+ 20,
494
+ 30,
495
+ 40,
496
+ 50,
497
+ 60,
498
+ 70,
499
+ 80,
500
+ 90,
501
+ 100,
502
+ 110,
503
+ 120,
504
+ 130,
505
+ 140,
506
+ 150,
507
+ 160,
508
+ 170,
509
+ 180,
510
+ 190,
511
+ 200,
512
+ 210,
513
+ 220,
514
+ 230,
515
+ 240,
516
+ 250,
517
+ 260,
518
+ 270,
519
+ 280,
520
+ 290,
521
+ 300,
522
+ 310,
523
+ 320,
524
+ 330,
525
+ 340,
526
+ 350,
527
+ 360,
528
+ 370,
529
+ 380,
530
+ 390,
531
+ 400,
532
+ 410,
533
+ 420,
534
+ 430,
535
+ 440,
536
+ 450,
537
+ 460,
538
+ 470,
539
+ 480,
540
+ 490,
541
+ 500
542
+ ],
543
+ "loss": [
544
+ 0.47720463275909425,
545
+ 0.3722280740737915,
546
+ 0.5989133432507515,
547
+ 0.29219049513339995,
548
+ 0.2222285270690918,
549
+ 0.208314248919487,
550
+ 0.18497198522090913,
551
+ 0.246753990650177,
552
+ 0.167646624147892,
553
+ 0.3369371652603149,
554
+ 0.24976162165403365,
555
+ 0.2363536536693573,
556
+ 0.2619509071111679,
557
+ 0.2550868228077888,
558
+ 0.1868986152112484,
559
+ 0.2387184202671051,
560
+ 0.19962314739823342,
561
+ 0.202347831428051,
562
+ 0.20775072425603866,
563
+ 0.17277054935693742,
564
+ 0.18261205181479453,
565
+ 0.23156674206256866,
566
+ 0.1980122596025467,
567
+ 0.31983775198459624,
568
+ 0.3451811149716377,
569
+ 0.3667415648698807,
570
+ 0.23837146311998367,
571
+ 0.25584650188684466,
572
+ 0.27007018625736234,
573
+ 0.36001030057668687,
574
+ 0.24520354121923446,
575
+ 0.24931957870721816,
576
+ 0.3233412757515907,
577
+ 0.3413666620850563,
578
+ 0.3326790541410446,
579
+ 0.2901922330260277,
580
+ 0.3419711276888847,
581
+ 0.2629153922200203,
582
+ 0.2039576329290867,
583
+ 0.24899442493915558,
584
+ 0.2017705723643303,
585
+ 0.2602558508515358,
586
+ 0.27445500791072847,
587
+ 0.3583936542272568,
588
+ 0.3106272265315056,
589
+ 0.2072318471968174,
590
+ 0.2656280748546124,
591
+ 0.18286765292286872,
592
+ 0.2839070238173008,
593
+ 0.19765938594937324
594
+ ],
595
+ "mean_return": [
596
+ 6.715000000000005,
597
+ 7.0890000000000075,
598
+ 6.662000000000008,
599
+ 6.703000000000005,
600
+ 9.288,
601
+ 4.207000000000005,
602
+ 7.5290000000000035,
603
+ -0.5189999999999957,
604
+ -0.7019999999999911,
605
+ 1.269000000000014,
606
+ 6.418000000000011,
607
+ 9.353000000000005,
608
+ 14.163000000000007,
609
+ 7.142000000000001,
610
+ 4.8839999999999995,
611
+ 5.996000000000007,
612
+ -1.004999999999994,
613
+ 9.264000000000005,
614
+ 9.412000000000004,
615
+ 8.656000000000006,
616
+ 1.6050000000000124,
617
+ -1.3469999999999982,
618
+ -0.1899999999999937,
619
+ 6.854000000000002,
620
+ 6.399000000000006,
621
+ 1.1550000000000071,
622
+ 2.0970000000000093,
623
+ 1.3930000000000053,
624
+ 5.5690000000000115,
625
+ 9.957999999999998,
626
+ 3.869000000000013,
627
+ 1.5870000000000029,
628
+ 5.257000000000018,
629
+ 2.4680000000000035,
630
+ 2.103000000000004,
631
+ -0.332999999999992,
632
+ 3.5430000000000126,
633
+ -1.2479999999999893,
634
+ 4.84700000000001,
635
+ 9.819000000000003,
636
+ 6.772999999999999,
637
+ 7.122,
638
+ 7.764000000000005,
639
+ -2.622000000000001,
640
+ 9.799,
641
+ 3.420000000000015,
642
+ 2.4489999999999985,
643
+ 1.5940000000000056,
644
+ 11.725000000000003,
645
+ 6.8199999999999985
646
+ ],
647
+ "win_rate_episode": [
648
+ 0.4,
649
+ 0.4,
650
+ 0.4,
651
+ 0.4,
652
+ 0.5,
653
+ 0.3,
654
+ 0.4,
655
+ 0.1,
656
+ 0.1,
657
+ 0.2,
658
+ 0.4,
659
+ 0.5,
660
+ 0.7,
661
+ 0.4,
662
+ 0.3,
663
+ 0.4,
664
+ 0.1,
665
+ 0.5,
666
+ 0.5,
667
+ 0.5,
668
+ 0.2,
669
+ 0.1,
670
+ 0.1,
671
+ 0.4,
672
+ 0.4,
673
+ 0.2,
674
+ 0.2,
675
+ 0.2,
676
+ 0.3,
677
+ 0.5,
678
+ 0.3,
679
+ 0.2,
680
+ 0.4,
681
+ 0.2,
682
+ 0.2,
683
+ 0.1,
684
+ 0.3,
685
+ 0.1,
686
+ 0.3,
687
+ 0.5,
688
+ 0.4,
689
+ 0.4,
690
+ 0.4,
691
+ 0.0,
692
+ 0.5,
693
+ 0.3,
694
+ 0.2,
695
+ 0.2,
696
+ 0.6,
697
+ 0.4
698
+ ],
699
+ "id_winrate": [
700
+ 0.275,
701
+ 0.35,
702
+ 0.32499999999999996,
703
+ 0.47500000000000003,
704
+ 0.275
705
+ ],
706
+ "id_winrate_iter": [
707
+ 100,
708
+ 200,
709
+ 300,
710
+ 400,
711
+ 500
712
+ ],
713
+ "ood_winrate": [
714
+ 0.03333333333333333
715
+ ],
716
+ "ood_winrate_iter": [
717
+ 500
718
+ ],
719
+ "grad_align_iter": [
720
+ 25,
721
+ 50,
722
+ 75,
723
+ 100,
724
+ 125,
725
+ 150,
726
+ 175,
727
+ 200,
728
+ 225,
729
+ 250,
730
+ 275,
731
+ 300,
732
+ 325,
733
+ 350,
734
+ 375,
735
+ 400,
736
+ 425,
737
+ 450,
738
+ 475,
739
+ 500
740
+ ],
741
+ "grad_align": [
742
+ 0.12577727437019348,
743
+ 0.475358784198761,
744
+ 0.1447940617799759,
745
+ 0.2517915964126587,
746
+ 0.24051342904567719,
747
+ 0.12343470752239227,
748
+ 0.26251906156539917,
749
+ 0.4390185475349426,
750
+ 0.3158450424671173,
751
+ 0.1859588623046875,
752
+ 0.1288192719221115,
753
+ 0.16457483172416687,
754
+ 0.06687517464160919,
755
+ 0.28203508257865906,
756
+ -0.0643845871090889,
757
+ 0.43475744128227234,
758
+ 0.0821591168642044,
759
+ 0.15027153491973877,
760
+ 0.2616863250732422,
761
+ 0.06052985042333603
762
+ ],
763
+ "rl_grad_norm": [
764
+ 1.6484646797180176,
765
+ 0.7641434669494629,
766
+ 0.3919910192489624,
767
+ 0.5171300172805786,
768
+ 0.9273585081100464,
769
+ 5.108277797698975,
770
+ 0.3941856324672699,
771
+ 0.6915857195854187,
772
+ 1.3585087060928345,
773
+ 5.926978588104248,
774
+ 1.4570164680480957,
775
+ 5.051280975341797,
776
+ 5.409900665283203,
777
+ 4.84480619430542,
778
+ 4.572493076324463,
779
+ 0.41527414321899414,
780
+ 1.148199200630188,
781
+ 3.8179819583892822,
782
+ 1.388190507888794,
783
+ 1.8957138061523438
784
+ ],
785
+ "bc_grad_norm": [
786
+ 21.5945987701416,
787
+ 32.32539749145508,
788
+ 16.469274520874023,
789
+ 20.107419967651367,
790
+ 37.188690185546875,
791
+ 24.627458572387695,
792
+ 22.156517028808594,
793
+ 27.076255798339844,
794
+ 19.586650848388672,
795
+ 36.89309310913086,
796
+ 30.176376342773438,
797
+ 32.40763854980469,
798
+ 27.814945220947266,
799
+ 23.15130615234375,
800
+ 19.543317794799805,
801
+ 15.00487995147705,
802
+ 24.16813850402832,
803
+ 21.311819076538086,
804
+ 27.31245994567871,
805
+ 24.81256866455078
806
+ ],
807
+ "repr_drift_iter": [
808
+ 25,
809
+ 50,
810
+ 75,
811
+ 100,
812
+ 125,
813
+ 150,
814
+ 175,
815
+ 200,
816
+ 225,
817
+ 250,
818
+ 275,
819
+ 300,
820
+ 325,
821
+ 350,
822
+ 375,
823
+ 400,
824
+ 425,
825
+ 450,
826
+ 475,
827
+ 500
828
+ ],
829
+ "repr_drift": [
830
+ 0.7076154947280884,
831
+ 0.8270145058631897,
832
+ 0.6357962489128113,
833
+ 0.9219959378242493,
834
+ 0.9769991636276245,
835
+ 0.7882189750671387,
836
+ 0.4730985462665558,
837
+ 0.7068943381309509,
838
+ 0.7026294469833374,
839
+ 0.8599920868873596,
840
+ 0.68658846616745,
841
+ 1.1955540180206299,
842
+ 0.8948519825935364,
843
+ 0.7279329299926758,
844
+ 0.5875495076179504,
845
+ 0.8631494641304016,
846
+ 0.6803992986679077,
847
+ 0.5603808164596558,
848
+ 1.1027553081512451,
849
+ 0.5930135250091553
850
+ ],
851
+ "t_analysis_iter": [
852
+ 25,
853
+ 50,
854
+ 75,
855
+ 100,
856
+ 125,
857
+ 150,
858
+ 175,
859
+ 200,
860
+ 225,
861
+ 250,
862
+ 275,
863
+ 300,
864
+ 325,
865
+ 350,
866
+ 375,
867
+ 400,
868
+ 425,
869
+ 450,
870
+ 475,
871
+ 500
872
+ ],
873
+ "norm_low_t": [
874
+ 0.7860373258590698,
875
+ 0.3810654878616333,
876
+ 0.404765784740448,
877
+ 0.5835322141647339,
878
+ 0.3631921410560608,
879
+ 0.5430107712745667,
880
+ 0.4888167381286621,
881
+ 0.5309621095657349,
882
+ 1.167321801185608,
883
+ 0.5996302962303162,
884
+ 0.8869525790214539,
885
+ 2.478888750076294,
886
+ 0.23427508771419525,
887
+ 0.6303129196166992,
888
+ 0.6949453353881836,
889
+ 0.4048846364021301,
890
+ 1.1991347074508667,
891
+ 1.8658772706985474,
892
+ 0.2984611392021179,
893
+ 3.9312291145324707
894
+ ],
895
+ "norm_high_t": [
896
+ 2.6291065216064453,
897
+ 0.7011670470237732,
898
+ 0.494642436504364,
899
+ 0.7701104283332825,
900
+ 2.5342204570770264,
901
+ 5.168290615081787,
902
+ 0.9434123635292053,
903
+ 3.9651055335998535,
904
+ 8.533324241638184,
905
+ 7.611879348754883,
906
+ 1.9443340301513672,
907
+ 4.746319770812988,
908
+ 2.965162754058838,
909
+ 5.064094066619873,
910
+ 4.081468105316162,
911
+ 0.6930497288703918,
912
+ 3.280083656311035,
913
+ 3.307518243789673,
914
+ 0.9383264183998108,
915
+ 2.1687474250793457
916
+ ],
917
+ "lowhigh_cos": [
918
+ 0.2833213210105896,
919
+ 0.22898192703723907,
920
+ 0.5208947062492371,
921
+ 0.6059155464172363,
922
+ -0.015384141355752945,
923
+ 0.12832285463809967,
924
+ 0.15682637691497803,
925
+ 0.2393011450767517,
926
+ -0.10133591294288635,
927
+ -0.11736318469047546,
928
+ 0.0006828118348494172,
929
+ 0.6370846629142761,
930
+ 0.10531243681907654,
931
+ 0.07136750221252441,
932
+ 0.45528534054756165,
933
+ 0.6974369883537292,
934
+ 0.5258194804191589,
935
+ 0.3584192991256714,
936
+ 0.2281290590763092,
937
+ 0.23442195355892181
938
+ ]
939
+ },
940
+ "frozen_backbone": {
941
+ "iter": [
942
+ 10,
943
+ 20,
944
+ 30,
945
+ 40,
946
+ 50,
947
+ 60,
948
+ 70,
949
+ 80,
950
+ 90,
951
+ 100,
952
+ 110,
953
+ 120,
954
+ 130,
955
+ 140,
956
+ 150,
957
+ 160,
958
+ 170,
959
+ 180,
960
+ 190,
961
+ 200,
962
+ 210,
963
+ 220,
964
+ 230,
965
+ 240,
966
+ 250,
967
+ 260,
968
+ 270,
969
+ 280,
970
+ 290,
971
+ 300,
972
+ 310,
973
+ 320,
974
+ 330,
975
+ 340,
976
+ 350,
977
+ 360,
978
+ 370,
979
+ 380,
980
+ 390,
981
+ 400,
982
+ 410,
983
+ 420,
984
+ 430,
985
+ 440,
986
+ 450,
987
+ 460,
988
+ 470,
989
+ 480,
990
+ 490,
991
+ 500
992
+ ],
993
+ "loss": [
994
+ 0.8432541787624359,
995
+ 0.6051806628704071,
996
+ 0.9249359101057053,
997
+ 0.5307856887578964,
998
+ 0.4062101155519485,
999
+ 1.2530769169330598,
1000
+ 0.9248880535364151,
1001
+ 0.6459330469369888,
1002
+ 0.3750177398324013,
1003
+ 0.4880190521478653,
1004
+ 0.5768440961837769,
1005
+ 0.34753591269254686,
1006
+ 0.46058507561683654,
1007
+ 0.5233289957046509,
1008
+ 0.49718080908060075,
1009
+ 0.5780578717589379,
1010
+ 0.3984171137213707,
1011
+ 0.28467713445425036,
1012
+ 0.21674730628728867,
1013
+ 0.19075852558016776,
1014
+ 0.19515338838100432,
1015
+ 0.24626194760203363,
1016
+ 0.3154190480709076,
1017
+ 0.2720298781991005,
1018
+ 0.23943385034799575,
1019
+ 0.1497550018131733,
1020
+ 0.18967929482460022,
1021
+ 0.12926405742764474,
1022
+ 0.18151819109916686,
1023
+ 0.2395605966448784,
1024
+ 0.198850055038929,
1025
+ 0.30438809990882876,
1026
+ 0.21534736976027488,
1027
+ 0.22932147979736328,
1028
+ 0.19919342622160913,
1029
+ 0.18485659509897232,
1030
+ 0.2822298489511013,
1031
+ 0.1751813419163227,
1032
+ 0.1328229382634163,
1033
+ 0.1326775722205639,
1034
+ 0.11731504946947098,
1035
+ 0.14120452776551246,
1036
+ 0.12244693860411644,
1037
+ 0.16777455508708955,
1038
+ 0.1694161780178547,
1039
+ 0.1586403626948595,
1040
+ 0.10118742138147355,
1041
+ 0.14438418820500373,
1042
+ 0.12352714017033577,
1043
+ 0.15858156234025955
1044
+ ],
1045
+ "mean_return": [
1046
+ 14.802000000000001,
1047
+ 9.073000000000002,
1048
+ 13.052000000000003,
1049
+ 1.3720000000000057,
1050
+ 4.766000000000008,
1051
+ 10.137,
1052
+ 8.38700000000001,
1053
+ -3.6769999999999974,
1054
+ 6.716999999999999,
1055
+ 1.3950000000000033,
1056
+ -0.9859999999999903,
1057
+ 1.7280000000000055,
1058
+ 2.1439999999999984,
1059
+ 2.2280000000000033,
1060
+ -2.8790000000000013,
1061
+ -3.4519999999999955,
1062
+ -0.8159999999999996,
1063
+ 1.6260000000000012,
1064
+ -1.531999999999989,
1065
+ -4.665999999999986,
1066
+ -1.4189999999999887,
1067
+ -0.8670000000000023,
1068
+ -0.5189999999999984,
1069
+ -0.9310000000000024,
1070
+ -0.8409999999999933,
1071
+ 0.9580000000000105,
1072
+ -3.4829999999999908,
1073
+ 0.35500000000001464,
1074
+ 5.418000000000001,
1075
+ -3.9919999999999867,
1076
+ -1.30199999999999,
1077
+ -0.7849999999999947,
1078
+ -1.5739999999999916,
1079
+ -3.1340000000000017,
1080
+ -1.602999999999985,
1081
+ -3.8719999999999914,
1082
+ -3.574999999999995,
1083
+ -4.1189999999999936,
1084
+ 3.8230000000000004,
1085
+ 1.6980000000000022,
1086
+ -0.9009999999999941,
1087
+ 9.245000000000001,
1088
+ 0.9290000000000103,
1089
+ -3.066999999999987,
1090
+ -2.8059999999999956,
1091
+ -3.5890000000000013,
1092
+ -0.34999999999999437,
1093
+ -0.921000000000002,
1094
+ -4.478999999999983,
1095
+ -3.7299999999999947
1096
+ ],
1097
+ "win_rate_episode": [
1098
+ 0.7,
1099
+ 0.5,
1100
+ 0.6,
1101
+ 0.2,
1102
+ 0.3,
1103
+ 0.5,
1104
+ 0.5,
1105
+ 0.0,
1106
+ 0.4,
1107
+ 0.2,
1108
+ 0.1,
1109
+ 0.2,
1110
+ 0.2,
1111
+ 0.2,
1112
+ 0.0,
1113
+ 0.0,
1114
+ 0.1,
1115
+ 0.2,
1116
+ 0.1,
1117
+ 0.0,
1118
+ 0.1,
1119
+ 0.1,
1120
+ 0.1,
1121
+ 0.1,
1122
+ 0.1,
1123
+ 0.2,
1124
+ 0.0,
1125
+ 0.2,
1126
+ 0.3,
1127
+ 0.0,
1128
+ 0.1,
1129
+ 0.1,
1130
+ 0.1,
1131
+ 0.0,
1132
+ 0.1,
1133
+ 0.0,
1134
+ 0.0,
1135
+ 0.0,
1136
+ 0.3,
1137
+ 0.2,
1138
+ 0.1,
1139
+ 0.5,
1140
+ 0.2,
1141
+ 0.0,
1142
+ 0.0,
1143
+ 0.0,
1144
+ 0.1,
1145
+ 0.1,
1146
+ 0.0,
1147
+ 0.0
1148
+ ],
1149
+ "id_winrate": [
1150
+ 0.525,
1151
+ 0.3,
1152
+ 0.07500000000000001,
1153
+ 0.05,
1154
+ 0.05
1155
+ ],
1156
+ "id_winrate_iter": [
1157
+ 100,
1158
+ 200,
1159
+ 300,
1160
+ 400,
1161
+ 500
1162
+ ],
1163
+ "ood_winrate": [
1164
+ 0.03333333333333333
1165
+ ],
1166
+ "ood_winrate_iter": [
1167
+ 500
1168
+ ],
1169
+ "grad_align_iter": [],
1170
+ "grad_align": [],
1171
+ "rl_grad_norm": [],
1172
+ "bc_grad_norm": [],
1173
+ "repr_drift_iter": [
1174
+ 25,
1175
+ 50,
1176
+ 75,
1177
+ 100,
1178
+ 125,
1179
+ 150,
1180
+ 175,
1181
+ 200,
1182
+ 225,
1183
+ 250,
1184
+ 275,
1185
+ 300,
1186
+ 325,
1187
+ 350,
1188
+ 375,
1189
+ 400,
1190
+ 425,
1191
+ 450,
1192
+ 475,
1193
+ 500
1194
+ ],
1195
+ "repr_drift": [
1196
+ 0.15479625761508942,
1197
+ 0.484932541847229,
1198
+ 0.5217902660369873,
1199
+ 0.7290679216384888,
1200
+ 0.5376307964324951,
1201
+ 0.8892215490341187,
1202
+ 1.3117743730545044,
1203
+ 1.452524185180664,
1204
+ 2.022955894470215,
1205
+ 1.654375433921814,
1206
+ 2.462423324584961,
1207
+ 2.5157599449157715,
1208
+ 2.2190136909484863,
1209
+ 2.9022059440612793,
1210
+ 2.673858880996704,
1211
+ 2.663733959197998,
1212
+ 4.088006019592285,
1213
+ 3.895555019378662,
1214
+ 4.047849655151367,
1215
+ 4.74190616607666
1216
+ ],
1217
+ "t_analysis_iter": [
1218
+ 25,
1219
+ 50,
1220
+ 75,
1221
+ 100,
1222
+ 125,
1223
+ 150,
1224
+ 175,
1225
+ 200,
1226
+ 225,
1227
+ 250,
1228
+ 275,
1229
+ 300,
1230
+ 325,
1231
+ 350,
1232
+ 375,
1233
+ 400,
1234
+ 425,
1235
+ 450,
1236
+ 475,
1237
+ 500
1238
+ ],
1239
+ "norm_low_t": [
1240
+ 0.5986595153808594,
1241
+ 0.6866065859794617,
1242
+ 0.42266008257865906,
1243
+ 11.257745742797852,
1244
+ 0.4751920998096466,
1245
+ 0.729705810546875,
1246
+ 0.45668742060661316,
1247
+ 0.5257764458656311,
1248
+ 0.3505500853061676,
1249
+ 0.5213647484779358,
1250
+ 0.4264509975910187,
1251
+ 0.3384961485862732,
1252
+ 0.27900227904319763,
1253
+ 0.2823605239391327,
1254
+ 0.1809273511171341,
1255
+ 0.3805403411388397,
1256
+ 0.26565390825271606,
1257
+ 3.993173122406006,
1258
+ 0.6758583188056946,
1259
+ 0.3116438686847687
1260
+ ],
1261
+ "norm_high_t": [
1262
+ 0.968696117401123,
1263
+ 14.156644821166992,
1264
+ 0.6386823654174805,
1265
+ 2.8380041122436523,
1266
+ 3.138822078704834,
1267
+ 1.537684440612793,
1268
+ 0.7374914884567261,
1269
+ 0.3873365819454193,
1270
+ 0.6569968461990356,
1271
+ 1.7117724418640137,
1272
+ 0.4027094542980194,
1273
+ 2.9362730979919434,
1274
+ 0.6246607899665833,
1275
+ 1.5652388334274292,
1276
+ 0.2736610174179077,
1277
+ 4.938521385192871,
1278
+ 1.3465253114700317,
1279
+ 2.811922550201416,
1280
+ 2.560199022293091,
1281
+ 0.773190975189209
1282
+ ],
1283
+ "lowhigh_cos": [
1284
+ 0.34187427163124084,
1285
+ -0.013280068524181843,
1286
+ 0.1718188226222992,
1287
+ 0.4921704828739166,
1288
+ 0.4253537058830261,
1289
+ 0.10736553370952606,
1290
+ 0.10130278766155243,
1291
+ 0.36192867159843445,
1292
+ 0.06617449969053268,
1293
+ 0.31431812047958374,
1294
+ 0.14055868983268738,
1295
+ -0.09543662518262863,
1296
+ 0.15687131881713867,
1297
+ 0.1258467733860016,
1298
+ 0.18483635783195496,
1299
+ 0.4652028977870941,
1300
+ -0.2167549729347229,
1301
+ 0.44112205505371094,
1302
+ 0.4113374948501587,
1303
+ 0.02761063352227211
1304
+ ]
1305
+ },
1306
+ "bc_on_wins": {
1307
+ "iter": [
1308
+ 10,
1309
+ 20,
1310
+ 30,
1311
+ 40,
1312
+ 50,
1313
+ 60,
1314
+ 70,
1315
+ 80,
1316
+ 90,
1317
+ 100,
1318
+ 110,
1319
+ 120,
1320
+ 130,
1321
+ 140,
1322
+ 150,
1323
+ 160,
1324
+ 170,
1325
+ 180,
1326
+ 190,
1327
+ 200,
1328
+ 210,
1329
+ 220,
1330
+ 230,
1331
+ 240,
1332
+ 250,
1333
+ 260,
1334
+ 270,
1335
+ 280,
1336
+ 290,
1337
+ 300,
1338
+ 310,
1339
+ 320,
1340
+ 330,
1341
+ 340,
1342
+ 350,
1343
+ 360,
1344
+ 370,
1345
+ 380,
1346
+ 390,
1347
+ 400,
1348
+ 410,
1349
+ 420,
1350
+ 430,
1351
+ 440,
1352
+ 450,
1353
+ 460,
1354
+ 470,
1355
+ 480,
1356
+ 490,
1357
+ 500
1358
+ ],
1359
+ "loss": [
1360
+ 0.0,
1361
+ 2.809250086545944,
1362
+ 1.5345448434352875,
1363
+ 1.3774227797985077,
1364
+ 0.273484418541193,
1365
+ 0.09973169527947903,
1366
+ 0.0766332889907062,
1367
+ 0.06333817462436855,
1368
+ 0.305202372930944,
1369
+ 0.13767885118722917,
1370
+ 0.13696757480502128,
1371
+ 0.12410348504781724,
1372
+ 0.2557923704385757,
1373
+ 0.223702372610569,
1374
+ 0.11811406463384629,
1375
+ 0.1622357800602913,
1376
+ 0.1446425747126341,
1377
+ 0.06279170280322433,
1378
+ 0.060602336190640926,
1379
+ 0.12183188050985336,
1380
+ 0.12065998539328575,
1381
+ 0.07745412643998861,
1382
+ 0.1753513276576996,
1383
+ 0.15074485763907433,
1384
+ 0.22228444442152978,
1385
+ 0.24410116225481032,
1386
+ 0.25474355965852735,
1387
+ 0.15105561465024947,
1388
+ 0.1774226851761341,
1389
+ 0.18747113943099974,
1390
+ 0.11809183433651924,
1391
+ 0.2158730670809746,
1392
+ 0.13357947915792465,
1393
+ 0.14638029038906097,
1394
+ 0.1506120301783085,
1395
+ 0.10705487877130508,
1396
+ 0.12562319375574588,
1397
+ 0.08112300224602223,
1398
+ 0.1059636577963829,
1399
+ 0.11252538859844208,
1400
+ 0.07254688572138548,
1401
+ 0.0704629324376583,
1402
+ 0.08125015869736671,
1403
+ 0.11510146819055081,
1404
+ 0.048482533730566504,
1405
+ 0.0742773407138884,
1406
+ 0.056103642005473374,
1407
+ 0.10363874156028033,
1408
+ 0.1402372971177101,
1409
+ 0.08852829486131668
1410
+ ],
1411
+ "mean_return": [
1412
+ 9.337000000000007,
1413
+ 3.8970000000000082,
1414
+ 7.945000000000002,
1415
+ -4.483999999999985,
1416
+ -0.8439999999999941,
1417
+ -4.142999999999989,
1418
+ 1.8410000000000093,
1419
+ -3.6869999999999954,
1420
+ 0.5000000000000071,
1421
+ -3.57399999999999,
1422
+ 0.41500000000002013,
1423
+ 1.5700000000000036,
1424
+ -1.10499999999999,
1425
+ -3.3829999999999965,
1426
+ -4.688999999999977,
1427
+ -2.5409999999999995,
1428
+ -1.394999999999992,
1429
+ -3.8249999999999913,
1430
+ -3.151999999999993,
1431
+ -2.763999999999993,
1432
+ 1.6550000000000076,
1433
+ -2.5829999999999993,
1434
+ -0.6139999999999803,
1435
+ 0.6570000000000067,
1436
+ -2.7069999999999856,
1437
+ 7.194,
1438
+ 0.389000000000001,
1439
+ -3.9149999999999836,
1440
+ -3.1179999999999954,
1441
+ -3.7479999999999913,
1442
+ 0.7230000000000114,
1443
+ -3.7059999999999933,
1444
+ -3.053999999999985,
1445
+ -3.2609999999999966,
1446
+ -0.3719999999999978,
1447
+ -3.333999999999996,
1448
+ -0.9279999999999944,
1449
+ -0.3180000000000021,
1450
+ 4.698000000000009,
1451
+ -1.359999999999996,
1452
+ -1.7269999999999768,
1453
+ -0.8929999999999929,
1454
+ -2.981999999999995,
1455
+ -1.4949999999999903,
1456
+ -0.6849999999999985,
1457
+ -1.5049999999999946,
1458
+ -3.3639999999999937,
1459
+ -2.5750000000000006,
1460
+ -1.2899999999999896,
1461
+ -2.8209999999999935
1462
+ ],
1463
+ "win_rate_episode": [
1464
+ 0.5,
1465
+ 0.3,
1466
+ 0.4,
1467
+ 0.0,
1468
+ 0.1,
1469
+ 0.0,
1470
+ 0.2,
1471
+ 0.0,
1472
+ 0.1,
1473
+ 0.0,
1474
+ 0.2,
1475
+ 0.2,
1476
+ 0.1,
1477
+ 0.0,
1478
+ 0.0,
1479
+ 0.0,
1480
+ 0.1,
1481
+ 0.0,
1482
+ 0.0,
1483
+ 0.0,
1484
+ 0.2,
1485
+ 0.0,
1486
+ 0.1,
1487
+ 0.2,
1488
+ 0.0,
1489
+ 0.4,
1490
+ 0.1,
1491
+ 0.0,
1492
+ 0.0,
1493
+ 0.0,
1494
+ 0.2,
1495
+ 0.0,
1496
+ 0.0,
1497
+ 0.0,
1498
+ 0.1,
1499
+ 0.0,
1500
+ 0.1,
1501
+ 0.1,
1502
+ 0.3,
1503
+ 0.1,
1504
+ 0.1,
1505
+ 0.1,
1506
+ 0.0,
1507
+ 0.1,
1508
+ 0.1,
1509
+ 0.1,
1510
+ 0.0,
1511
+ 0.0,
1512
+ 0.1,
1513
+ 0.0
1514
+ ],
1515
+ "id_winrate": [
1516
+ 0.05,
1517
+ 0.07500000000000001,
1518
+ 0.07500000000000001,
1519
+ 0.05,
1520
+ 0.07500000000000001
1521
+ ],
1522
+ "id_winrate_iter": [
1523
+ 100,
1524
+ 200,
1525
+ 300,
1526
+ 400,
1527
+ 500
1528
+ ],
1529
+ "ood_winrate": [
1530
+ 0.0
1531
+ ],
1532
+ "ood_winrate_iter": [
1533
+ 500
1534
+ ],
1535
+ "grad_align_iter": [
1536
+ 25,
1537
+ 50,
1538
+ 75,
1539
+ 100,
1540
+ 125,
1541
+ 150,
1542
+ 175,
1543
+ 200,
1544
+ 225,
1545
+ 250,
1546
+ 275,
1547
+ 300,
1548
+ 325,
1549
+ 350,
1550
+ 375,
1551
+ 400,
1552
+ 425,
1553
+ 450,
1554
+ 475,
1555
+ 500
1556
+ ],
1557
+ "grad_align": [
1558
+ -0.15585047006607056,
1559
+ 0.014388328418135643,
1560
+ -0.0008289138786494732,
1561
+ 0.009685968980193138,
1562
+ -0.14255520701408386,
1563
+ -0.03257212042808533,
1564
+ 0.15654288232326508,
1565
+ 0.008197564631700516,
1566
+ -0.11398962885141373,
1567
+ 0.017725320532917976,
1568
+ -0.0008931104093790054,
1569
+ 0.03191083297133446,
1570
+ -0.055659446865320206,
1571
+ -0.0731707215309143,
1572
+ 0.08635709434747696,
1573
+ -0.0048180073499679565,
1574
+ 0.05373326316475868,
1575
+ -0.051411353051662445,
1576
+ 0.031434930860996246,
1577
+ 0.010565715841948986
1578
+ ],
1579
+ "rl_grad_norm": [
1580
+ 17.90909194946289,
1581
+ 9.267593383789062,
1582
+ 2.835970401763916,
1583
+ 1.5497463941574097,
1584
+ 0.648881196975708,
1585
+ 0.17218180000782013,
1586
+ 1.2748949527740479,
1587
+ 1.3406476974487305,
1588
+ 2.2597739696502686,
1589
+ 1.3221700191497803,
1590
+ 1.6656370162963867,
1591
+ 1.313543677330017,
1592
+ 1.1664280891418457,
1593
+ 1.290276288986206,
1594
+ 0.5993183255195618,
1595
+ 1.2788653373718262,
1596
+ 2.9476051330566406,
1597
+ 0.33282503485679626,
1598
+ 0.1632971167564392,
1599
+ 1.2346807718276978
1600
+ ],
1601
+ "bc_grad_norm": [
1602
+ 37.38460922241211,
1603
+ 57.11834716796875,
1604
+ 55.3564338684082,
1605
+ 43.19457244873047,
1606
+ 42.32614517211914,
1607
+ 49.10124588012695,
1608
+ 44.5236930847168,
1609
+ 49.081302642822266,
1610
+ 49.36444854736328,
1611
+ 46.51043701171875,
1612
+ 42.867820739746094,
1613
+ 46.32526779174805,
1614
+ 41.31015396118164,
1615
+ 45.431243896484375,
1616
+ 40.98322677612305,
1617
+ 42.76980972290039,
1618
+ 46.33472442626953,
1619
+ 45.720069885253906,
1620
+ 51.702796936035156,
1621
+ 53.660099029541016
1622
+ ],
1623
+ "repr_drift_iter": [
1624
+ 25,
1625
+ 50,
1626
+ 75,
1627
+ 100,
1628
+ 125,
1629
+ 150,
1630
+ 175,
1631
+ 200,
1632
+ 225,
1633
+ 250,
1634
+ 275,
1635
+ 300,
1636
+ 325,
1637
+ 350,
1638
+ 375,
1639
+ 400,
1640
+ 425,
1641
+ 450,
1642
+ 475,
1643
+ 500
1644
+ ],
1645
+ "repr_drift": [
1646
+ 6.027996063232422,
1647
+ 6.684228897094727,
1648
+ 11.888472557067871,
1649
+ 15.770934104919434,
1650
+ 19.830795288085938,
1651
+ 17.60736083984375,
1652
+ 14.056540489196777,
1653
+ 13.429742813110352,
1654
+ 13.57545280456543,
1655
+ 13.590494155883789,
1656
+ 14.123512268066406,
1657
+ 14.357796669006348,
1658
+ 15.667505264282227,
1659
+ 13.821261405944824,
1660
+ 14.207181930541992,
1661
+ 13.969894409179688,
1662
+ 14.930856704711914,
1663
+ 14.994295120239258,
1664
+ 14.834063529968262,
1665
+ 14.4537992477417
1666
+ ],
1667
+ "t_analysis_iter": [
1668
+ 25,
1669
+ 50,
1670
+ 75,
1671
+ 100,
1672
+ 125,
1673
+ 150,
1674
+ 175,
1675
+ 200,
1676
+ 225,
1677
+ 250,
1678
+ 275,
1679
+ 300,
1680
+ 325,
1681
+ 350,
1682
+ 375,
1683
+ 400,
1684
+ 425,
1685
+ 450,
1686
+ 475,
1687
+ 500
1688
+ ],
1689
+ "norm_low_t": [
1690
+ 20.25221824645996,
1691
+ 0.754869282245636,
1692
+ 0.17779888212680817,
1693
+ 3.0701024532318115,
1694
+ 1.7086597681045532,
1695
+ 2.237590789794922,
1696
+ 5.5478386878967285,
1697
+ 6.110817909240723,
1698
+ 0.46025317907333374,
1699
+ 0.11583580821752548,
1700
+ 3.8377528190612793,
1701
+ 0.8099531531333923,
1702
+ 1.5192604064941406,
1703
+ 5.958379745483398,
1704
+ 0.4879041016101837,
1705
+ 1.4312430620193481,
1706
+ 0.4303628206253052,
1707
+ 0.37884828448295593,
1708
+ 0.46471869945526123,
1709
+ 2.001178741455078
1710
+ ],
1711
+ "norm_high_t": [
1712
+ 21.141122817993164,
1713
+ 3.2488327026367188,
1714
+ 3.682577610015869,
1715
+ 2.676034450531006,
1716
+ 1.2364305257797241,
1717
+ 2.0812149047851562,
1718
+ 1.543370008468628,
1719
+ 2.039804220199585,
1720
+ 1.3472521305084229,
1721
+ 2.179379463195801,
1722
+ 5.384540557861328,
1723
+ 1.6266305446624756,
1724
+ 0.9822807908058167,
1725
+ 1.5289103984832764,
1726
+ 2.2616400718688965,
1727
+ 2.6361918449401855,
1728
+ 0.9600972533226013,
1729
+ 0.442050963640213,
1730
+ 0.3930566608905792,
1731
+ 0.7594590783119202
1732
+ ],
1733
+ "lowhigh_cos": [
1734
+ 0.46482086181640625,
1735
+ -0.018056275323033333,
1736
+ 0.2649460434913635,
1737
+ 0.2176809012889862,
1738
+ 0.21259765326976776,
1739
+ 0.340675413608551,
1740
+ -0.17475268244743347,
1741
+ 0.46381503343582153,
1742
+ 0.295942097902298,
1743
+ 0.35303354263305664,
1744
+ 0.3157437741756439,
1745
+ 0.15970274806022644,
1746
+ 0.5105319023132324,
1747
+ 0.3774409294128418,
1748
+ 0.22727873921394348,
1749
+ 0.20067614316940308,
1750
+ 0.2528092861175537,
1751
+ 0.4246519207954407,
1752
+ -0.24365077912807465,
1753
+ 0.4615093469619751
1754
+ ]
1755
+ },
1756
+ "low_t_only": {
1757
+ "iter": [
1758
+ 10,
1759
+ 20,
1760
+ 30,
1761
+ 40,
1762
+ 50,
1763
+ 60,
1764
+ 70,
1765
+ 80,
1766
+ 90,
1767
+ 100,
1768
+ 110,
1769
+ 120,
1770
+ 130,
1771
+ 140,
1772
+ 150,
1773
+ 160,
1774
+ 170,
1775
+ 180,
1776
+ 190,
1777
+ 200,
1778
+ 210,
1779
+ 220,
1780
+ 230,
1781
+ 240,
1782
+ 250,
1783
+ 260,
1784
+ 270,
1785
+ 280,
1786
+ 290,
1787
+ 300,
1788
+ 310,
1789
+ 320,
1790
+ 330,
1791
+ 340,
1792
+ 350,
1793
+ 360,
1794
+ 370,
1795
+ 380,
1796
+ 390,
1797
+ 400,
1798
+ 410,
1799
+ 420,
1800
+ 430,
1801
+ 440,
1802
+ 450,
1803
+ 460,
1804
+ 470,
1805
+ 480,
1806
+ 490,
1807
+ 500
1808
+ ],
1809
+ "loss": [
1810
+ 0.16954294890165328,
1811
+ 0.12819076813757418,
1812
+ 0.17230339162051678,
1813
+ 0.09248451106250286,
1814
+ 0.16420554406940938,
1815
+ 0.1693354170769453,
1816
+ 0.5027543395757675,
1817
+ 0.5454125240445137,
1818
+ 0.18514752238988877,
1819
+ 0.07633375357836485,
1820
+ 0.29205870665609834,
1821
+ 0.10658948160707951,
1822
+ 0.1487645037472248,
1823
+ 0.07421774212270975,
1824
+ 0.1682606603950262,
1825
+ 0.03632904961705208,
1826
+ 0.0382061586715281,
1827
+ 0.02044668570160866,
1828
+ 0.07106257285922765,
1829
+ 0.08296542745083571,
1830
+ 0.053635398391634226,
1831
+ 0.17159658959135413,
1832
+ 0.06913567988667638,
1833
+ 0.02520559049444273,
1834
+ 0.023581319977529346,
1835
+ 0.024397523747757076,
1836
+ 0.01780195008032024,
1837
+ 0.05103614148683846,
1838
+ 0.007415541564114392,
1839
+ 0.024072827032068745,
1840
+ 0.17588082720176318,
1841
+ 0.041048154188320043,
1842
+ 0.01608952531169052,
1843
+ 0.012731735737179405,
1844
+ 0.08324823407456279,
1845
+ 0.006074277223888203,
1846
+ 0.0398330731986789,
1847
+ 0.015381297026760877,
1848
+ 0.0053429307314218025,
1849
+ 0.006615792286174838,
1850
+ 0.00928727765185613,
1851
+ 0.006404643525092979,
1852
+ 0.0021499938324268443,
1853
+ 0.003018749266720988,
1854
+ 0.003801402702447376,
1855
+ 0.025576220805851334,
1856
+ 0.010855860910396586,
1857
+ 0.008461204392369836,
1858
+ 0.0035062126145021465,
1859
+ 0.003927168410518789
1860
+ ],
1861
+ "mean_return": [
1862
+ 6.852000000000008,
1863
+ 12.036000000000005,
1864
+ 4.393000000000008,
1865
+ 9.750999999999996,
1866
+ -0.7949999999999873,
1867
+ 0.3310000000000041,
1868
+ -0.39499999999999486,
1869
+ -0.8029999999999967,
1870
+ -4.120999999999986,
1871
+ -1.315999999999994,
1872
+ -1.9249999999999883,
1873
+ -3.5099999999999896,
1874
+ -4.153999999999989,
1875
+ -3.610999999999991,
1876
+ -1.2209999999999934,
1877
+ -3.4219999999999944,
1878
+ -3.9589999999999903,
1879
+ -1.556999999999995,
1880
+ -1.2349999999999972,
1881
+ -0.8050000000000004,
1882
+ -4.293999999999988,
1883
+ 0.9720000000000105,
1884
+ -3.571999999999993,
1885
+ -4.081999999999988,
1886
+ -1.518999999999984,
1887
+ -1.2259999999999966,
1888
+ -0.10999999999999766,
1889
+ -3.3099999999999907,
1890
+ -3.8639999999999963,
1891
+ -3.451999999999996,
1892
+ 0.8820000000000116,
1893
+ 2.355,
1894
+ 0.2150000000000046,
1895
+ -4.427999999999989,
1896
+ -4.061999999999985,
1897
+ -3.813999999999994,
1898
+ -2.9899999999999984,
1899
+ -1.2689999999999966,
1900
+ -4.1039999999999885,
1901
+ -3.571999999999997,
1902
+ -0.5320000000000011,
1903
+ -3.539999999999999,
1904
+ -0.7389999999999958,
1905
+ -3.6379999999999995,
1906
+ -2.521999999999996,
1907
+ -2.839999999999999,
1908
+ 2.3600000000000034,
1909
+ -1.5089999999999921,
1910
+ -4.397999999999984,
1911
+ -4.4839999999999876
1912
+ ],
1913
+ "win_rate_episode": [
1914
+ 0.4,
1915
+ 0.6,
1916
+ 0.3,
1917
+ 0.5,
1918
+ 0.1,
1919
+ 0.1,
1920
+ 0.1,
1921
+ 0.1,
1922
+ 0.0,
1923
+ 0.1,
1924
+ 0.1,
1925
+ 0.0,
1926
+ 0.0,
1927
+ 0.0,
1928
+ 0.1,
1929
+ 0.0,
1930
+ 0.0,
1931
+ 0.1,
1932
+ 0.1,
1933
+ 0.1,
1934
+ 0.0,
1935
+ 0.2,
1936
+ 0.0,
1937
+ 0.0,
1938
+ 0.1,
1939
+ 0.1,
1940
+ 0.1,
1941
+ 0.0,
1942
+ 0.0,
1943
+ 0.0,
1944
+ 0.2,
1945
+ 0.2,
1946
+ 0.1,
1947
+ 0.0,
1948
+ 0.0,
1949
+ 0.0,
1950
+ 0.0,
1951
+ 0.1,
1952
+ 0.0,
1953
+ 0.0,
1954
+ 0.1,
1955
+ 0.0,
1956
+ 0.1,
1957
+ 0.0,
1958
+ 0.0,
1959
+ 0.0,
1960
+ 0.2,
1961
+ 0.1,
1962
+ 0.0,
1963
+ 0.0
1964
+ ],
1965
+ "id_winrate": [
1966
+ 0.05,
1967
+ 0.05,
1968
+ 0.0,
1969
+ 0.1,
1970
+ 0.025
1971
+ ],
1972
+ "id_winrate_iter": [
1973
+ 100,
1974
+ 200,
1975
+ 300,
1976
+ 400,
1977
+ 500
1978
+ ],
1979
+ "ood_winrate": [
1980
+ 0.0
1981
+ ],
1982
+ "ood_winrate_iter": [
1983
+ 500
1984
+ ],
1985
+ "grad_align_iter": [
1986
+ 25,
1987
+ 50,
1988
+ 75,
1989
+ 100,
1990
+ 125,
1991
+ 150,
1992
+ 175,
1993
+ 200,
1994
+ 225,
1995
+ 250,
1996
+ 275,
1997
+ 300,
1998
+ 325,
1999
+ 350,
2000
+ 375,
2001
+ 400,
2002
+ 425,
2003
+ 450,
2004
+ 475,
2005
+ 500
2006
+ ],
2007
+ "grad_align": [
2008
+ 0.2528616189956665,
2009
+ -0.03359116613864899,
2010
+ 0.02611994370818138,
2011
+ 0.0035057030618190765,
2012
+ 0.07301062345504761,
2013
+ -0.05015534162521362,
2014
+ 0.04849525913596153,
2015
+ 0.04365059360861778,
2016
+ 0.16888359189033508,
2017
+ 0.038295499980449677,
2018
+ -0.020476030185818672,
2019
+ 0.11001922190189362,
2020
+ -0.03407794237136841,
2021
+ -0.2625855803489685,
2022
+ 0.04202434420585632,
2023
+ 0.1309511661529541,
2024
+ -0.006483441684395075,
2025
+ 0.005877040326595306,
2026
+ -0.003088176716119051,
2027
+ -0.007350331172347069
2028
+ ],
2029
+ "rl_grad_norm": [
2030
+ 0.708732008934021,
2031
+ 4.030664920806885,
2032
+ 4.342776298522949,
2033
+ 0.7132593989372253,
2034
+ 3.507981061935425,
2035
+ 1.6703108549118042,
2036
+ 2.1692845821380615,
2037
+ 1.730566143989563,
2038
+ 0.6206149458885193,
2039
+ 1.1632376909255981,
2040
+ 4.863556385040283,
2041
+ 0.43041908740997314,
2042
+ 0.34915438294410706,
2043
+ 3.184418201446533,
2044
+ 3.9088478088378906,
2045
+ 0.632773220539093,
2046
+ 0.3219628930091858,
2047
+ 0.058308567851781845,
2048
+ 7.77617073059082,
2049
+ 0.43638089299201965
2050
+ ],
2051
+ "bc_grad_norm": [
2052
+ 19.74188232421875,
2053
+ 21.830142974853516,
2054
+ 22.09646224975586,
2055
+ 21.381101608276367,
2056
+ 26.10833168029785,
2057
+ 24.06924819946289,
2058
+ 23.066692352294922,
2059
+ 26.531723022460938,
2060
+ 23.48773765563965,
2061
+ 18.655624389648438,
2062
+ 22.68110466003418,
2063
+ 23.721654891967773,
2064
+ 26.712148666381836,
2065
+ 38.09318542480469,
2066
+ 13.851006507873535,
2067
+ 25.637195587158203,
2068
+ 22.518232345581055,
2069
+ 26.217267990112305,
2070
+ 20.741737365722656,
2071
+ 22.005754470825195
2072
+ ],
2073
+ "repr_drift_iter": [
2074
+ 25,
2075
+ 50,
2076
+ 75,
2077
+ 100,
2078
+ 125,
2079
+ 150,
2080
+ 175,
2081
+ 200,
2082
+ 225,
2083
+ 250,
2084
+ 275,
2085
+ 300,
2086
+ 325,
2087
+ 350,
2088
+ 375,
2089
+ 400,
2090
+ 425,
2091
+ 450,
2092
+ 475,
2093
+ 500
2094
+ ],
2095
+ "repr_drift": [
2096
+ 1.8782662153244019,
2097
+ 2.9116997718811035,
2098
+ 2.912844181060791,
2099
+ 3.6854569911956787,
2100
+ 3.0491065979003906,
2101
+ 4.025200366973877,
2102
+ 4.199270248413086,
2103
+ 4.503491401672363,
2104
+ 3.1839468479156494,
2105
+ 3.253112554550171,
2106
+ 3.8597023487091064,
2107
+ 5.276954650878906,
2108
+ 4.396190643310547,
2109
+ 3.5537588596343994,
2110
+ 2.945570707321167,
2111
+ 4.853464126586914,
2112
+ 4.4204020500183105,
2113
+ 4.409016132354736,
2114
+ 5.554651260375977,
2115
+ 4.029665470123291
2116
+ ],
2117
+ "t_analysis_iter": [
2118
+ 25,
2119
+ 50,
2120
+ 75,
2121
+ 100,
2122
+ 125,
2123
+ 150,
2124
+ 175,
2125
+ 200,
2126
+ 225,
2127
+ 250,
2128
+ 275,
2129
+ 300,
2130
+ 325,
2131
+ 350,
2132
+ 375,
2133
+ 400,
2134
+ 425,
2135
+ 450,
2136
+ 475,
2137
+ 500
2138
+ ],
2139
+ "norm_low_t": [
2140
+ 0.8107094168663025,
2141
+ 0.5331999659538269,
2142
+ 0.5337082147598267,
2143
+ 0.4895950257778168,
2144
+ 2.713630199432373,
2145
+ 0.49118441343307495,
2146
+ 0.08188782632350922,
2147
+ 0.39152956008911133,
2148
+ 0.2501949965953827,
2149
+ 0.1392025649547577,
2150
+ 0.33048632740974426,
2151
+ 0.1271236538887024,
2152
+ 0.29568424820899963,
2153
+ 0.11166119575500488,
2154
+ 0.28577253222465515,
2155
+ 0.20214661955833435,
2156
+ 0.00438786530867219,
2157
+ 0.0016851737163960934,
2158
+ 8.534547805786133,
2159
+ 0.08704172074794769
2160
+ ],
2161
+ "norm_high_t": [
2162
+ 0.9552221894264221,
2163
+ 2.923011302947998,
2164
+ 2.3088951110839844,
2165
+ 3.3186538219451904,
2166
+ 5.08151388168335,
2167
+ 2.726167678833008,
2168
+ 1.75440514087677,
2169
+ 3.37270188331604,
2170
+ 0.6986761093139648,
2171
+ 0.5467635989189148,
2172
+ 2.235100269317627,
2173
+ 0.4686378538608551,
2174
+ 0.45633357763290405,
2175
+ 2.896705389022827,
2176
+ 0.8851325511932373,
2177
+ 6.972827434539795,
2178
+ 2.036700487136841,
2179
+ 0.20503078401088715,
2180
+ 1.1935913562774658,
2181
+ 0.7886370420455933
2182
+ ],
2183
+ "lowhigh_cos": [
2184
+ 0.1374405026435852,
2185
+ 0.044124118983745575,
2186
+ 0.252091646194458,
2187
+ 0.2623433470726013,
2188
+ 0.2519252598285675,
2189
+ -0.0307513065636158,
2190
+ 0.08866468071937561,
2191
+ 0.3848387598991394,
2192
+ 0.04764184355735779,
2193
+ 0.15670593082904816,
2194
+ -0.11194318532943726,
2195
+ -0.3121059238910675,
2196
+ 0.4413709342479706,
2197
+ 0.36320483684539795,
2198
+ 0.10349930822849274,
2199
+ 0.28844499588012695,
2200
+ 0.475638747215271,
2201
+ -0.08917384594678879,
2202
+ 0.4083694517612457,
2203
+ 0.20888718962669373
2204
+ ]
2205
+ },
2206
+ "synthetic": {
2207
+ "iter": [
2208
+ 10,
2209
+ 20,
2210
+ 30,
2211
+ 40,
2212
+ 50,
2213
+ 60,
2214
+ 70,
2215
+ 80,
2216
+ 90,
2217
+ 100,
2218
+ 110,
2219
+ 120,
2220
+ 130,
2221
+ 140,
2222
+ 150,
2223
+ 160,
2224
+ 170,
2225
+ 180,
2226
+ 190,
2227
+ 200,
2228
+ 210,
2229
+ 220,
2230
+ 230,
2231
+ 240,
2232
+ 250,
2233
+ 260,
2234
+ 270,
2235
+ 280,
2236
+ 290,
2237
+ 300,
2238
+ 310,
2239
+ 320,
2240
+ 330,
2241
+ 340,
2242
+ 350,
2243
+ 360,
2244
+ 370,
2245
+ 380,
2246
+ 390,
2247
+ 400,
2248
+ 410,
2249
+ 420,
2250
+ 430,
2251
+ 440,
2252
+ 450,
2253
+ 460,
2254
+ 470,
2255
+ 480,
2256
+ 490,
2257
+ 500
2258
+ ],
2259
+ "loss": [
2260
+ 0.7531764507293701,
2261
+ 0.18765906989574432,
2262
+ 0.020723845809698105,
2263
+ 0.02801017463207245,
2264
+ 0.0033709718845784664,
2265
+ 0.20654648542404175,
2266
+ 0.08307354152202606,
2267
+ 0.01641359180212021,
2268
+ 0.00023058410442899913,
2269
+ 0.006693837232887745,
2270
+ 0.0006810599006712437,
2271
+ 0.025508670136332512,
2272
+ 0.0015238457126542926,
2273
+ 0.05820140987634659,
2274
+ 0.0008688275702297688,
2275
+ 0.00020045570272486657,
2276
+ 0.00022424766211770475,
2277
+ 3.519005258567631e-05,
2278
+ 0.01258005015552044,
2279
+ 3.128516982542351e-05,
2280
+ 0.0009578695753589272,
2281
+ 0.0010310658253729343,
2282
+ 0.03165672719478607,
2283
+ 1.3986779777042102e-05,
2284
+ 0.0014118634862825274,
2285
+ 0.002570120384916663,
2286
+ 0.0007432415150105953,
2287
+ 1.3963205674372148e-05,
2288
+ 9.562265768181533e-05,
2289
+ 8.88048016349785e-05,
2290
+ 0.0001328531070612371,
2291
+ 0.00014389181160368025,
2292
+ 0.2233337163925171,
2293
+ 9.16919598239474e-05,
2294
+ 0.06965351104736328,
2295
+ 0.0024555602576583624,
2296
+ 0.00010386335634393618,
2297
+ 0.03198186308145523,
2298
+ 0.00020677264546975493,
2299
+ 0.01207452081143856,
2300
+ 0.00016542139928787947,
2301
+ 1.0498906704015099e-05,
2302
+ 2.4054688196883944e-07,
2303
+ 1.8041833754978143e-05,
2304
+ 0.0001326937781414017,
2305
+ 4.153811460128054e-05,
2306
+ 0.0006373185897246003,
2307
+ 0.059840451925992966,
2308
+ 0.03510911762714386,
2309
+ 1.009482002700679e-05
2310
+ ],
2311
+ "mean_return": [],
2312
+ "win_rate_episode": [],
2313
+ "id_winrate": [
2314
+ 0.15000000000000002,
2315
+ 0.15,
2316
+ 0.025,
2317
+ 0.05,
2318
+ 0.0
2319
+ ],
2320
+ "id_winrate_iter": [
2321
+ 100,
2322
+ 200,
2323
+ 300,
2324
+ 400,
2325
+ 500
2326
+ ],
2327
+ "ood_winrate": [],
2328
+ "ood_winrate_iter": [],
2329
+ "grad_align_iter": [],
2330
+ "grad_align": [],
2331
+ "rl_grad_norm": [],
2332
+ "bc_grad_norm": [],
2333
+ "repr_drift_iter": [],
2334
+ "repr_drift": [],
2335
+ "t_analysis_iter": [],
2336
+ "norm_low_t": [],
2337
+ "norm_high_t": [],
2338
+ "lowhigh_cos": []
2339
+ }
2340
+ }
2341
+ }