2025-07-09 10:02:50,441 - INFO - Training with parameters:
2025-07-09 10:02:50,442 - INFO -   Text model: sentence-transformers/paraphrase-multilingual-mpnet-base-v2
2025-07-09 10:02:50,442 - INFO -   Audio model: facebook/w2v-bert-2.0
2025-07-09 10:02:50,442 - INFO -   Freeze encoders: partial
2025-07-09 10:02:50,442 - INFO -   Text layers to unfreeze: 3
2025-07-09 10:02:50,442 - INFO -   Audio layers to unfreeze: 3
2025-07-09 10:02:50,442 - INFO -   Use cross-modal attention: False
2025-07-09 10:02:50,442 - INFO -   Use attentive pooling: False
2025-07-09 10:02:50,442 - INFO -   Use word-level alignment: False
2025-07-09 10:02:50,442 - INFO -   Batch size: 48
2025-07-09 10:02:50,442 - INFO -   Gradient accumulation steps: 15
2025-07-09 10:02:50,442 - INFO -   Effective batch size: 720
2025-07-09 10:02:50,442 - INFO -   Mixed precision training: False
2025-07-09 10:02:50,442 - INFO -   Learning rate: 0.0008
2025-07-09 10:02:50,442 - INFO -   Temperature: 0.1
2025-07-09 10:02:50,442 - INFO -   Projection dimension: 768
2025-07-09 10:02:50,442 - INFO -   Training samples: 21968
2025-07-09 10:02:50,442 - INFO -   Validation samples: 9464
2025-07-09 10:02:50,442 - INFO -   Test samples: 9467
2025-07-09 10:02:50,442 - INFO -   Max audio length: 480000 samples (30.00 seconds at 16kHz)
2025-07-09 10:02:50,442 - INFO - Loading tokenizer and feature extractor...
2025-07-09 10:02:51,406 - INFO - Feature extractor output keys: ['input_features', 'attention_mask']
2025-07-09 10:02:51,406 - INFO - Creating datasets...
2025-07-09 10:02:51,406 - INFO - Feature extractor output keys: ['input_features', 'attention_mask']
2025-07-09 10:02:51,407 - INFO - Feature extractor output keys: ['input_features', 'attention_mask']
2025-07-09 10:02:51,407 - INFO - Feature extractor output keys: ['input_features', 'attention_mask']
2025-07-09 10:02:51,407 - INFO - Creating data loaders...
2025-07-09 10:02:51,407 - INFO - Checking a sample batch...
2025-07-09 10:05:56,755 - INFO - Training with parameters:
2025-07-09 10:05:56,755 - INFO -   Text model: sentence-transformers/paraphrase-multilingual-mpnet-base-v2
2025-07-09 10:05:56,755 - INFO -   Audio model: facebook/w2v-bert-2.0
2025-07-09 10:05:56,755 - INFO -   Freeze encoders: partial
2025-07-09 10:05:56,755 - INFO -   Text layers to unfreeze: 3
2025-07-09 10:05:56,755 - INFO -   Audio layers to unfreeze: 3
2025-07-09 10:05:56,755 - INFO -   Use cross-modal attention: False
2025-07-09 10:05:56,755 - INFO -   Use attentive pooling: False
2025-07-09 10:05:56,755 - INFO -   Use word-level alignment: False
2025-07-09 10:05:56,755 - INFO -   Batch size: 48
2025-07-09 10:05:56,755 - INFO -   Gradient accumulation steps: 15
2025-07-09 10:05:56,755 - INFO -   Effective batch size: 720
2025-07-09 10:05:56,755 - INFO -   Mixed precision training: False
2025-07-09 10:05:56,755 - INFO -   Learning rate: 0.0008
2025-07-09 10:05:56,755 - INFO -   Temperature: 0.1
2025-07-09 10:05:56,755 - INFO -   Projection dimension: 768
2025-07-09 10:05:56,755 - INFO -   Training samples: 21968
2025-07-09 10:05:56,755 - INFO -   Validation samples: 9464
2025-07-09 10:05:56,755 - INFO -   Test samples: 9467
2025-07-09 10:05:56,755 - INFO -   Max audio length: 480000 samples (30.00 seconds at 16kHz)
2025-07-09 10:05:56,755 - INFO - Loading tokenizer and feature extractor...
2025-07-09 10:05:57,689 - INFO - Feature extractor output keys: ['input_features', 'attention_mask']
2025-07-09 10:05:57,689 - INFO - Creating datasets...
2025-07-09 10:05:57,689 - INFO - Feature extractor output keys: ['input_features', 'attention_mask']
2025-07-09 10:05:57,690 - INFO - Feature extractor output keys: ['input_features', 'attention_mask']
2025-07-09 10:05:57,690 - INFO - Feature extractor output keys: ['input_features', 'attention_mask']
2025-07-09 10:05:57,690 - INFO - Creating data loaders...
2025-07-09 10:05:57,690 - INFO - Checking a sample batch...
2025-07-09 10:06:15,500 - INFO -   input_ids_pos: torch.Size([48, 128])
2025-07-09 10:06:15,500 - INFO -   attention_mask_pos: torch.Size([48, 128])
2025-07-09 10:06:15,500 - INFO -   input_ids_neg: torch.Size([48, 128])
2025-07-09 10:06:15,500 - INFO -   attention_mask_neg: torch.Size([48, 128])
2025-07-09 10:06:15,500 - INFO -   input_values: torch.Size([48, 473, 160])
2025-07-09 10:06:15,500 - INFO -   attention_mask_audio: torch.Size([48, 473])
2025-07-09 10:06:15,500 - INFO -   is_corrupted: torch.Size([48])
2025-07-09 10:06:15,500 - INFO - Initializing model...
2025-07-09 10:06:16,250 - INFO - Text encoder hidden dim: 768
2025-07-09 10:06:16,250 - INFO - Audio encoder hidden dim: 1024
2025-07-09 10:06:16,250 - INFO - Partial freezing: unfreezing last 3 text layers and 3 audio layers
2025-07-09 10:06:16,250 - INFO - Unfreezing text encoder layer 9
2025-07-09 10:06:16,250 - INFO - Unfreezing text encoder layer 10
2025-07-09 10:06:16,250 - INFO - Unfreezing text encoder layer 11
2025-07-09 10:06:16,251 - INFO - Unfreezing audio encoder layer 21
2025-07-09 10:06:16,251 - INFO - Unfreezing audio encoder layer 22
2025-07-09 10:06:16,251 - INFO - Unfreezing audio encoder layer 23
2025-07-09 10:06:16,281 - INFO - Model initialized with 292,079,360 trainable parameters out of 863,656,256 total
2025-07-09 10:06:17,120 - INFO - Using discriminative learning rates: encoder_lr=4e-05, main_lr=0.0008
2025-07-09 10:06:17,120 - INFO - Encoder parameters: 156, Non-encoder parameters: 12
2025-07-09 10:06:17,120 - INFO - Scheduler setup:
2025-07-09 10:06:17,121 - INFO -   Batches per epoch: 457
2025-07-09 10:06:17,121 - INFO -   Accumulation steps: 15
2025-07-09 10:06:17,121 - INFO -   Optimizer steps per epoch: 31
2025-07-09 10:06:17,121 - INFO -   Total optimizer steps: 930
2025-07-09 10:06:17,121 - INFO -   Warmup steps: 1000
2025-07-09 10:06:17,121 - INFO - Validating gradient accumulation setup...
2025-07-09 10:06:17,121 - INFO - Validating gradient accumulation with 15 steps...
2025-07-09 10:06:35,061 - WARNING - Not enough test batches (10) for accumulation_steps (15)
2025-07-09 10:06:35,061 - INFO - Starting training for 30 epochs
2025-07-09 10:18:51,576 - INFO - Epoch 1: Total optimizer steps: 31
2025-07-09 10:22:08,662 - INFO - Validation metrics:
2025-07-09 10:22:08,663 - INFO -   Loss: 0.5501
2025-07-09 10:22:08,663 - INFO -   Average similarity: 0.0474
2025-07-09 10:22:08,663 - INFO -   Median similarity: 0.0319
2025-07-09 10:22:08,663 - INFO -   Clean sample similarity: 0.0474
2025-07-09 10:22:08,663 - INFO -   Corrupted sample similarity: 0.0339
2025-07-09 10:22:08,663 - INFO -   Similarity gap (clean - corrupt): 0.0134
2025-07-09 10:22:08,790 - INFO - Epoch 1/30 - Train Loss: 0.6551, Val Loss: 0.5501, Clean Sim: 0.0474, Corrupt Sim: 0.0339, Gap: 0.0134, Time: 933.73s
2025-07-09 10:22:08,790 - INFO - New best validation loss: 0.5501
2025-07-09 10:22:15,090 - INFO - New best similarity gap: 0.0134
2025-07-09 10:34:33,415 - INFO - Epoch 2: Total optimizer steps: 31
2025-07-09 10:37:49,380 - INFO - Validation metrics:
2025-07-09 10:37:49,380 - INFO -   Loss: 0.3848
2025-07-09 10:37:49,380 - INFO -   Average similarity: 0.3295
2025-07-09 10:37:49,380 - INFO -   Median similarity: 0.1824
2025-07-09 10:37:49,380 - INFO -   Clean sample similarity: 0.3295
2025-07-09 10:37:49,380 - INFO -   Corrupted sample similarity: 0.1992
2025-07-09 10:37:49,380 - INFO -   Similarity gap (clean - corrupt): 0.1303
2025-07-09 10:37:49,528 - INFO - Epoch 2/30 - Train Loss: 0.5121, Val Loss: 0.3848, Clean Sim: 0.3295, Corrupt Sim: 0.1992, Gap: 0.1303, Time: 928.07s
2025-07-09 10:37:49,528 - INFO - New best validation loss: 0.3848
2025-07-09 10:37:56,439 - INFO - New best similarity gap: 0.1303
2025-07-09 10:40:47,309 - INFO - Epoch 2 Validation Alignment: Pos=0.000, Neg=0.000, Gap=0.000
2025-07-09 10:52:54,869 - INFO - Epoch 3: Total optimizer steps: 31
2025-07-09 10:56:10,304 - INFO - Validation metrics:
2025-07-09 10:56:10,304 - INFO -   Loss: 0.3347
2025-07-09 10:56:10,304 - INFO -   Average similarity: 0.4569
2025-07-09 10:56:10,304 - INFO -   Median similarity: 0.3819
2025-07-09 10:56:10,304 - INFO -   Clean sample similarity: 0.4569
2025-07-09 10:56:10,304 - INFO -   Corrupted sample similarity: 0.2749
2025-07-09 10:56:10,304 - INFO -   Similarity gap (clean - corrupt): 0.1820
2025-07-09 10:56:10,443 - INFO - Epoch 3/30 - Train Loss: 0.4332, Val Loss: 0.3347, Clean Sim: 0.4569, Corrupt Sim: 0.2749, Gap: 0.1820, Time: 923.13s
2025-07-09 10:56:10,443 - INFO - New best validation loss: 0.3347
2025-07-09 10:56:17,152 - INFO - New best similarity gap: 0.1820
2025-07-09 11:08:36,466 - INFO - Epoch 4: Total optimizer steps: 31
2025-07-09 11:11:52,118 - INFO - Validation metrics:
2025-07-09 11:11:52,118 - INFO -   Loss: 0.3008
2025-07-09 11:11:52,118 - INFO -   Average similarity: 0.4935
2025-07-09 11:11:52,118 - INFO -   Median similarity: 0.4840
2025-07-09 11:11:52,118 - INFO -   Clean sample similarity: 0.4935
2025-07-09 11:11:52,118 - INFO -   Corrupted sample similarity: 0.2809
2025-07-09 11:11:52,118 - INFO -   Similarity gap (clean - corrupt): 0.2126
2025-07-09 11:11:52,234 - INFO - Epoch 4/30 - Train Loss: 0.3885, Val Loss: 0.3008, Clean Sim: 0.4935, Corrupt Sim: 0.2809, Gap: 0.2126, Time: 927.71s
2025-07-09 11:11:52,234 - INFO - New best validation loss: 0.3008
2025-07-09 11:11:58,985 - INFO - New best similarity gap: 0.2126
2025-07-09 11:14:50,013 - INFO - Epoch 4 Validation Alignment: Pos=0.000, Neg=0.000, Gap=0.000
2025-07-09 11:26:58,690 - INFO - Epoch 5: Total optimizer steps: 31
2025-07-09 11:30:15,510 - INFO - Validation metrics:
2025-07-09 11:30:15,510 - INFO -   Loss: 0.2910
2025-07-09 11:30:15,510 - INFO -   Average similarity: 0.5942
2025-07-09 11:30:15,510 - INFO -   Median similarity: 0.8150
2025-07-09 11:30:15,510 - INFO -   Clean sample similarity: 0.5942
2025-07-09 11:30:15,510 - INFO -   Corrupted sample similarity: 0.3597
2025-07-09 11:30:15,510 - INFO -   Similarity gap (clean - corrupt): 0.2344
2025-07-09 11:30:15,637 - INFO - Epoch 5/30 - Train Loss: 0.3716, Val Loss: 0.2910, Clean Sim: 0.5942, Corrupt Sim: 0.3597, Gap: 0.2344, Time: 925.62s
2025-07-09 11:30:15,637 - INFO - New best validation loss: 0.2910
2025-07-09 11:30:22,374 - INFO - New best similarity gap: 0.2344
2025-07-09 11:42:45,432 - INFO - Epoch 6: Total optimizer steps: 31
2025-07-09 11:46:01,110 - INFO - Validation metrics:
2025-07-09 11:46:01,111 - INFO -   Loss: 0.2737
2025-07-09 11:46:01,111 - INFO -   Average similarity: 0.5773
2025-07-09 11:46:01,111 - INFO -   Median similarity: 0.7764
2025-07-09 11:46:01,111 - INFO -   Clean sample similarity: 0.5773
2025-07-09 11:46:01,111 - INFO -   Corrupted sample similarity: 0.3289
2025-07-09 11:46:01,111 - INFO -   Similarity gap (clean - corrupt): 0.2484
2025-07-09 11:46:01,242 - INFO - Epoch 6/30 - Train Loss: 0.3509, Val Loss: 0.2737, Clean Sim: 0.5773, Corrupt Sim: 0.3289, Gap: 0.2484, Time: 931.59s
2025-07-09 11:46:01,242 - INFO - New best validation loss: 0.2737
2025-07-09 11:46:08,171 - INFO - New best similarity gap: 0.2484
2025-07-09 11:48:58,321 - INFO - Epoch 6 Validation Alignment: Pos=0.000, Neg=0.000, Gap=0.000
2025-07-09 12:01:16,208 - INFO - Epoch 7: Total optimizer steps: 31
2025-07-09 12:04:31,657 - INFO - Validation metrics:
2025-07-09 12:04:31,657 - INFO -   Loss: 0.2616
2025-07-09 12:04:31,657 - INFO -   Average similarity: 0.6094
2025-07-09 12:04:31,657 - INFO -   Median similarity: 0.8658
2025-07-09 12:04:31,657 - INFO -   Clean sample similarity: 0.6094
2025-07-09 12:04:31,658 - INFO -   Corrupted sample similarity: 0.3416
2025-07-09 12:04:31,658 - INFO -   Similarity gap (clean - corrupt): 0.2678
2025-07-09 12:04:31,765 - INFO - Epoch 7/30 - Train Loss: 0.3341, Val Loss: 0.2616, Clean Sim: 0.6094, Corrupt Sim: 0.3416, Gap: 0.2678, Time: 933.44s
2025-07-09 12:04:31,765 - INFO - New best validation loss: 0.2616
2025-07-09 12:04:38,550 - INFO - New best similarity gap: 0.2678
2025-07-09 12:17:00,511 - INFO - Epoch 8: Total optimizer steps: 31
2025-07-09 12:20:16,262 - INFO - Validation metrics:
2025-07-09 12:20:16,262 - INFO -   Loss: 0.2580
2025-07-09 12:20:16,262 - INFO -   Average similarity: 0.6054
2025-07-09 12:20:16,262 - INFO -   Median similarity: 0.8577
2025-07-09 12:20:16,262 - INFO -   Clean sample similarity: 0.6054
2025-07-09 12:20:16,262 - INFO -   Corrupted sample similarity: 0.3324
2025-07-09 12:20:16,262 - INFO -   Similarity gap (clean - corrupt): 0.2730
2025-07-09 12:20:16,374 - INFO - Epoch 8/30 - Train Loss: 0.3235, Val Loss: 0.2580, Clean Sim: 0.6054, Corrupt Sim: 0.3324, Gap: 0.2730, Time: 930.51s
2025-07-09 12:20:16,374 - INFO - New best validation loss: 0.2580
2025-07-09 12:20:23,227 - INFO - New best similarity gap: 0.2730
2025-07-09 12:23:13,199 - INFO - Epoch 8 Validation Alignment: Pos=0.000, Neg=0.000, Gap=0.000
2025-07-09 12:35:29,937 - INFO - Epoch 9: Total optimizer steps: 31
2025-07-09 12:38:45,452 - INFO - Validation metrics:
2025-07-09 12:38:45,452 - INFO -   Loss: 0.2467
2025-07-09 12:38:45,452 - INFO -   Average similarity: 0.6282
2025-07-09 12:38:45,452 - INFO -   Median similarity: 0.9019
2025-07-09 12:38:45,452 - INFO -   Clean sample similarity: 0.6282
2025-07-09 12:38:45,452 - INFO -   Corrupted sample similarity: 0.3381
2025-07-09 12:38:45,452 - INFO -   Similarity gap (clean - corrupt): 0.2901
2025-07-09 12:38:45,580 - INFO - Epoch 9/30 - Train Loss: 0.3127, Val Loss: 0.2467, Clean Sim: 0.6282, Corrupt Sim: 0.3381, Gap: 0.2901, Time: 932.38s
2025-07-09 12:38:45,580 - INFO - New best validation loss: 0.2467
2025-07-09 12:38:52,326 - INFO - New best similarity gap: 0.2901
2025-07-09 12:51:11,687 - INFO - Epoch 10: Total optimizer steps: 31
2025-07-09 12:54:27,759 - INFO - Validation metrics:
2025-07-09 12:54:27,760 - INFO -   Loss: 0.2337
2025-07-09 12:54:27,760 - INFO -   Average similarity: 0.6409
2025-07-09 12:54:27,760 - INFO -   Median similarity: 0.9230
2025-07-09 12:54:27,760 - INFO -   Clean sample similarity: 0.6409
2025-07-09 12:54:27,760 - INFO -   Corrupted sample similarity: 0.3338
2025-07-09 12:54:27,760 - INFO -   Similarity gap (clean - corrupt): 0.3071
2025-07-09 12:54:27,863 - INFO - Epoch 10/30 - Train Loss: 0.3025, Val Loss: 0.2337, Clean Sim: 0.6409, Corrupt Sim: 0.3338, Gap: 0.3071, Time: 928.26s
2025-07-09 12:54:27,863 - INFO - New best validation loss: 0.2337
2025-07-09 12:54:34,605 - INFO - New best similarity gap: 0.3071
2025-07-09 12:57:23,489 - INFO - Epoch 10 Validation Alignment: Pos=0.000, Neg=0.000, Gap=0.000
2025-07-09 13:09:34,426 - INFO - Epoch 11: Total optimizer steps: 31
2025-07-09 13:12:51,270 - INFO - Validation metrics:
2025-07-09 13:12:51,270 - INFO -   Loss: 0.2312
2025-07-09 13:12:51,270 - INFO -   Average similarity: 0.6329
2025-07-09 13:12:51,270 - INFO -   Median similarity: 0.9068
2025-07-09 13:12:51,270 - INFO -   Clean sample similarity: 0.6329
2025-07-09 13:12:51,270 - INFO -   Corrupted sample similarity: 0.3190
2025-07-09 13:12:51,270 - INFO -   Similarity gap (clean - corrupt): 0.3138
2025-07-09 13:12:51,384 - INFO - Epoch 11/30 - Train Loss: 0.2904, Val Loss: 0.2312, Clean Sim: 0.6329, Corrupt Sim: 0.3190, Gap: 0.3138, Time: 927.89s
2025-07-09 13:12:51,384 - INFO - New best validation loss: 0.2312
2025-07-09 13:12:58,336 - INFO - New best similarity gap: 0.3138
2025-07-09 13:25:17,357 - INFO - Epoch 12: Total optimizer steps: 31
2025-07-09 13:28:35,622 - INFO - Validation metrics:
2025-07-09 13:28:35,622 - INFO -   Loss: 0.2223
2025-07-09 13:28:35,622 - INFO -   Average similarity: 0.6732
2025-07-09 13:28:35,622 - INFO -   Median similarity: 0.9485
2025-07-09 13:28:35,622 - INFO -   Clean sample similarity: 0.6732
2025-07-09 13:28:35,622 - INFO -   Corrupted sample similarity: 0.3435
2025-07-09 13:28:35,622 - INFO -   Similarity gap (clean - corrupt): 0.3297
2025-07-09 13:28:35,738 - INFO - Epoch 12/30 - Train Loss: 0.2779, Val Loss: 0.2223, Clean Sim: 0.6732, Corrupt Sim: 0.3435, Gap: 0.3297, Time: 930.02s
2025-07-09 13:28:35,738 - INFO - New best validation loss: 0.2223
2025-07-09 13:28:42,677 - INFO - New best similarity gap: 0.3297
2025-07-09 13:31:33,532 - INFO - Epoch 12 Validation Alignment: Pos=0.000, Neg=0.000, Gap=0.000
2025-07-09 13:43:51,607 - INFO - Epoch 13: Total optimizer steps: 31
2025-07-09 13:47:08,884 - INFO - Validation metrics:
2025-07-09 13:47:08,885 - INFO -   Loss: 0.2143
2025-07-09 13:47:08,885 - INFO -   Average similarity: 0.6180
2025-07-09 13:47:08,885 - INFO -   Median similarity: 0.8872
2025-07-09 13:47:08,885 - INFO -   Clean sample similarity: 0.6180
2025-07-09 13:47:08,885 - INFO -   Corrupted sample similarity: 0.2805
2025-07-09 13:47:08,885 - INFO -   Similarity gap (clean - corrupt): 0.3375
2025-07-09 13:47:09,002 - INFO - Epoch 13/30 - Train Loss: 0.2746, Val Loss: 0.2143, Clean Sim: 0.6180, Corrupt Sim: 0.2805, Gap: 0.3375, Time: 935.47s
2025-07-09 13:47:09,003 - INFO - New best validation loss: 0.2143
2025-07-09 13:47:16,005 - INFO - New best similarity gap: 0.3375
2025-07-09 13:59:44,255 - INFO - Epoch 14: Total optimizer steps: 31
2025-07-09 14:03:01,055 - INFO - Validation metrics:
2025-07-09 14:03:01,055 - INFO -   Loss: 0.2056
2025-07-09 14:03:01,055 - INFO -   Average similarity: 0.6810
2025-07-09 14:03:01,055 - INFO -   Median similarity: 0.9569
2025-07-09 14:03:01,055 - INFO -   Clean sample similarity: 0.6810
2025-07-09 14:03:01,055 - INFO -   Corrupted sample similarity: 0.3251
2025-07-09 14:03:01,055 - INFO -   Similarity gap (clean - corrupt): 0.3559
2025-07-09 14:03:01,156 - INFO - Epoch 14/30 - Train Loss: 0.2641, Val Loss: 0.2056, Clean Sim: 0.6810, Corrupt Sim: 0.3251, Gap: 0.3559, Time: 937.64s
2025-07-09 14:03:01,156 - INFO - New best validation loss: 0.2056
2025-07-09 14:03:07,933 - INFO - New best similarity gap: 0.3559
2025-07-09 14:05:58,018 - INFO - Epoch 14 Validation Alignment: Pos=0.000, Neg=0.000, Gap=0.000
2025-07-09 14:18:08,049 - INFO - Epoch 15: Total optimizer steps: 31
2025-07-09 14:21:24,577 - INFO - Validation metrics:
2025-07-09 14:21:24,577 - INFO -   Loss: 0.2027
2025-07-09 14:21:24,577 - INFO -   Average similarity: 0.6446
2025-07-09 14:21:24,577 - INFO -   Median similarity: 0.9029
2025-07-09 14:21:24,577 - INFO -   Clean sample similarity: 0.6446
2025-07-09 14:21:24,577 - INFO -   Corrupted sample similarity: 0.2914
2025-07-09 14:21:24,578 - INFO -   Similarity gap (clean - corrupt): 0.3532
2025-07-09 14:21:24,706 - INFO - Epoch 15/30 - Train Loss: 0.2543, Val Loss: 0.2027, Clean Sim: 0.6446, Corrupt Sim: 0.2914, Gap: 0.3532, Time: 926.69s
2025-07-09 14:21:24,706 - INFO - New best validation loss: 0.2027
2025-07-09 14:33:59,853 - INFO - Epoch 16: Total optimizer steps: 31
2025-07-09 14:37:18,100 - INFO - Validation metrics:
2025-07-09 14:37:18,100 - INFO -   Loss: 0.1950
2025-07-09 14:37:18,100 - INFO -   Average similarity: 0.7201
2025-07-09 14:37:18,100 - INFO -   Median similarity: 0.9757
2025-07-09 14:37:18,100 - INFO -   Clean sample similarity: 0.7201
2025-07-09 14:37:18,100 - INFO -   Corrupted sample similarity: 0.3419
2025-07-09 14:37:18,100 - INFO -   Similarity gap (clean - corrupt): 0.3782
2025-07-09 14:37:18,222 - INFO - Epoch 16/30 - Train Loss: 0.2506, Val Loss: 0.1950, Clean Sim: 0.7201, Corrupt Sim: 0.3419, Gap: 0.3782, Time: 941.32s
2025-07-09 14:37:18,222 - INFO - New best validation loss: 0.1950
2025-07-09 14:37:24,624 - INFO - New best similarity gap: 0.3782
2025-07-09 14:40:15,594 - INFO - Epoch 16 Validation Alignment: Pos=0.000, Neg=0.000, Gap=0.000
2025-07-09 14:52:44,776 - INFO - Epoch 17: Total optimizer steps: 31
2025-07-09 14:56:02,178 - INFO - Validation metrics:
2025-07-09 14:56:02,178 - INFO -   Loss: 0.1946
2025-07-09 14:56:02,178 - INFO -   Average similarity: 0.6943
2025-07-09 14:56:02,178 - INFO -   Median similarity: 0.9603
2025-07-09 14:56:02,178 - INFO -   Clean sample similarity: 0.6943
2025-07-09 14:56:02,178 - INFO -   Corrupted sample similarity: 0.3080
2025-07-09 14:56:02,178 - INFO -   Similarity gap (clean - corrupt): 0.3864
2025-07-09 14:56:02,281 - INFO - Epoch 17/30 - Train Loss: 0.2467, Val Loss: 0.1946, Clean Sim: 0.6943, Corrupt Sim: 0.3080, Gap: 0.3864, Time: 946.69s
2025-07-09 14:56:02,282 - INFO - New best validation loss: 0.1946
2025-07-09 14:56:08,783 - INFO - New best similarity gap: 0.3864
2025-07-09 15:08:42,755 - INFO - Epoch 18: Total optimizer steps: 31
2025-07-09 15:12:01,042 - INFO - Validation metrics:
2025-07-09 15:12:01,043 - INFO -   Loss: 0.1850
2025-07-09 15:12:01,043 - INFO -   Average similarity: 0.7162
2025-07-09 15:12:01,043 - INFO -   Median similarity: 0.9751
2025-07-09 15:12:01,043 - INFO -   Clean sample similarity: 0.7162
2025-07-09 15:12:01,043 - INFO -   Corrupted sample similarity: 0.3205
2025-07-09 15:12:01,043 - INFO -   Similarity gap (clean - corrupt): 0.3957
2025-07-09 15:12:01,157 - INFO - Epoch 18/30 - Train Loss: 0.2353, Val Loss: 0.1850, Clean Sim: 0.7162, Corrupt Sim: 0.3205, Gap: 0.3957, Time: 945.76s
2025-07-09 15:12:01,157 - INFO - New best validation loss: 0.1850
2025-07-09 15:12:07,833 - INFO - New best similarity gap: 0.3957
2025-07-09 15:14:57,923 - INFO - Epoch 18 Validation Alignment: Pos=0.000, Neg=0.000, Gap=0.000
2025-07-09 15:27:22,903 - INFO - Epoch 19: Total optimizer steps: 31
2025-07-09 15:30:40,855 - INFO - Validation metrics:
2025-07-09 15:30:40,855 - INFO -   Loss: 0.1868
2025-07-09 15:30:40,855 - INFO -   Average similarity: 0.7511
2025-07-09 15:30:40,855 - INFO -   Median similarity: 0.9870
2025-07-09 15:30:40,855 - INFO -   Clean sample similarity: 0.7511
2025-07-09 15:30:40,855 - INFO -   Corrupted sample similarity: 0.3606
2025-07-09 15:30:40,855 - INFO -   Similarity gap (clean - corrupt): 0.3905
2025-07-09 15:30:40,988 - INFO - Epoch 19/30 - Train Loss: 0.2352, Val Loss: 0.1868, Clean Sim: 0.7511, Corrupt Sim: 0.3606, Gap: 0.3905, Time: 943.06s
2025-07-09 15:43:07,937 - INFO - Epoch 20: Total optimizer steps: 31
2025-07-09 15:46:25,284 - INFO - Validation metrics:
2025-07-09 15:46:25,285 - INFO -   Loss: 0.1752
2025-07-09 15:46:25,285 - INFO -   Average similarity: 0.6895
2025-07-09 15:46:25,285 - INFO -   Median similarity: 0.9676
2025-07-09 15:46:25,285 - INFO -   Clean sample similarity: 0.6895
2025-07-09 15:46:25,285 - INFO -   Corrupted sample similarity: 0.2810
2025-07-09 15:46:25,285 - INFO -   Similarity gap (clean - corrupt): 0.4085
2025-07-09 15:46:25,420 - INFO - Epoch 20/30 - Train Loss: 0.2317, Val Loss: 0.1752, Clean Sim: 0.6895, Corrupt Sim: 0.2810, Gap: 0.4085, Time: 944.43s
2025-07-09 15:46:25,420 - INFO - New best validation loss: 0.1752
2025-07-09 15:46:32,053 - INFO - New best similarity gap: 0.4085
2025-07-09 15:49:22,949 - INFO - Epoch 20 Validation Alignment: Pos=0.000, Neg=0.000, Gap=0.000
2025-07-09 16:01:50,850 - INFO - Epoch 21: Total optimizer steps: 31
2025-07-09 16:05:08,533 - INFO - Validation metrics:
2025-07-09 16:05:08,533 - INFO -   Loss: 0.1761
2025-07-09 16:05:08,533 - INFO -   Average similarity: 0.6436
2025-07-09 16:05:08,533 - INFO -   Median similarity: 0.9055
2025-07-09 16:05:08,533 - INFO -   Clean sample similarity: 0.6436
2025-07-09 16:05:08,533 - INFO -   Corrupted sample similarity: 0.2365
2025-07-09 16:05:08,533 - INFO -   Similarity gap (clean - corrupt): 0.4070
2025-07-09 16:05:08,641 - INFO - Epoch 21/30 - Train Loss: 0.2267, Val Loss: 0.1761, Clean Sim: 0.6436, Corrupt Sim: 0.2365, Gap: 0.4070, Time: 945.69s
2025-07-09 16:17:36,939 - INFO - Epoch 22: Total optimizer steps: 31
2025-07-09 16:20:54,509 - INFO - Validation metrics:
2025-07-09 16:20:54,509 - INFO -   Loss: 0.1725
2025-07-09 16:20:54,509 - INFO -   Average similarity: 0.7292
2025-07-09 16:20:54,509 - INFO -   Median similarity: 0.9750
2025-07-09 16:20:54,509 - INFO -   Clean sample similarity: 0.7292
2025-07-09 16:20:54,509 - INFO -   Corrupted sample similarity: 0.3020
2025-07-09 16:20:54,509 - INFO -   Similarity gap (clean - corrupt): 0.4272
2025-07-09 16:20:54,647 - INFO - Epoch 22/30 - Train Loss: 0.2244, Val Loss: 0.1725, Clean Sim: 0.7292, Corrupt Sim: 0.3020, Gap: 0.4272, Time: 946.01s
2025-07-09 16:20:54,647 - INFO - New best validation loss: 0.1725
2025-07-09 16:21:01,162 - INFO - New best similarity gap: 0.4272
2025-07-09 16:23:50,529 - INFO - Epoch 22 Validation Alignment: Pos=0.000, Neg=0.000, Gap=0.000
2025-07-09 16:36:19,562 - INFO - Epoch 23: Total optimizer steps: 31
2025-07-09 16:39:36,938 - INFO - Validation metrics:
2025-07-09 16:39:36,938 - INFO -   Loss: 0.1698
2025-07-09 16:39:36,938 - INFO -   Average similarity: 0.6585
2025-07-09 16:39:36,938 - INFO -   Median similarity: 0.9260
2025-07-09 16:39:36,938 - INFO -   Clean sample similarity: 0.6585
2025-07-09 16:39:36,938 - INFO -   Corrupted sample similarity: 0.2528
2025-07-09 16:39:36,938 - INFO -   Similarity gap (clean - corrupt): 0.4057
2025-07-09 16:39:37,057 - INFO - Epoch 23/30 - Train Loss: 0.2147, Val Loss: 0.1698, Clean Sim: 0.6585, Corrupt Sim: 0.2528, Gap: 0.4057, Time: 946.53s
2025-07-09 16:39:37,057 - INFO - New best validation loss: 0.1698
2025-07-09 16:52:02,629 - INFO - Epoch 24: Total optimizer steps: 31
2025-07-09 16:55:20,171 - INFO - Validation metrics:
2025-07-09 16:55:20,171 - INFO -   Loss: 0.1622
2025-07-09 16:55:20,171 - INFO -   Average similarity: 0.6216
2025-07-09 16:55:20,171 - INFO -   Median similarity: 0.8589
2025-07-09 16:55:20,171 - INFO -   Clean sample similarity: 0.6216
2025-07-09 16:55:20,171 - INFO -   Corrupted sample similarity: 0.2012
2025-07-09 16:55:20,171 - INFO -   Similarity gap (clean - corrupt): 0.4203
2025-07-09 16:55:20,303 - INFO - Epoch 24/30 - Train Loss: 0.2169, Val Loss: 0.1622, Clean Sim: 0.6216, Corrupt Sim: 0.2012, Gap: 0.4203, Time: 936.70s
2025-07-09 16:55:20,304 - INFO - New best validation loss: 0.1622
2025-07-09 16:58:12,057 - INFO - Epoch 24 Validation Alignment: Pos=0.000, Neg=0.000, Gap=0.000
2025-07-09 17:10:36,292 - INFO - Epoch 25: Total optimizer steps: 31
2025-07-09 17:13:54,630 - INFO - Validation metrics:
2025-07-09 17:13:54,630 - INFO -   Loss: 0.1701
2025-07-09 17:13:54,630 - INFO -   Average similarity: 0.6491
2025-07-09 17:13:54,630 - INFO -   Median similarity: 0.8909
2025-07-09 17:13:54,630 - INFO -   Clean sample similarity: 0.6491
2025-07-09 17:13:54,630 - INFO -   Corrupted sample similarity: 0.2283
2025-07-09 17:13:54,630 - INFO -   Similarity gap (clean - corrupt): 0.4207
2025-07-09 17:13:54,735 - INFO - Epoch 25/30 - Train Loss: 0.2145, Val Loss: 0.1701, Clean Sim: 0.6491, Corrupt Sim: 0.2283, Gap: 0.4207, Time: 942.68s
2025-07-09 17:26:12,647 - INFO - Epoch 26: Total optimizer steps: 31
2025-07-09 17:29:28,828 - INFO - Validation metrics:
2025-07-09 17:29:28,828 - INFO -   Loss: 0.1657
2025-07-09 17:29:28,828 - INFO -   Average similarity: 0.7093
2025-07-09 17:29:28,828 - INFO -   Median similarity: 0.9734
2025-07-09 17:29:28,828 - INFO -   Clean sample similarity: 0.7093
2025-07-09 17:29:28,828 - INFO -   Corrupted sample similarity: 0.2811
2025-07-09 17:29:28,828 - INFO -   Similarity gap (clean - corrupt): 0.4282
2025-07-09 17:29:28,950 - INFO - Epoch 26/30 - Train Loss: 0.2103, Val Loss: 0.1657, Clean Sim: 0.7093, Corrupt Sim: 0.2811, Gap: 0.4282, Time: 934.21s
2025-07-09 17:29:28,950 - INFO - New best similarity gap: 0.4282
2025-07-09 17:32:19,140 - INFO - Epoch 26 Validation Alignment: Pos=0.000, Neg=0.000, Gap=0.000
2025-07-09 17:44:44,050 - INFO - Epoch 27: Total optimizer steps: 31
2025-07-09 17:48:03,080 - INFO - Validation metrics:
2025-07-09 17:48:03,080 - INFO -   Loss: 0.1599
2025-07-09 17:48:03,080 - INFO -   Average similarity: 0.7226
2025-07-09 17:48:03,080 - INFO -   Median similarity: 0.9653
2025-07-09 17:48:03,080 - INFO -   Clean sample similarity: 0.7226
2025-07-09 17:48:03,080 - INFO -   Corrupted sample similarity: 0.2710
2025-07-09 17:48:03,080 - INFO -   Similarity gap (clean - corrupt): 0.4516
2025-07-09 17:48:03,205 - INFO - Epoch 27/30 - Train Loss: 0.2110, Val Loss: 0.1599, Clean Sim: 0.7226, Corrupt Sim: 0.2710, Gap: 0.4516, Time: 944.06s
2025-07-09 17:48:03,206 - INFO - New best validation loss: 0.1599
2025-07-09 17:48:09,874 - INFO - New best similarity gap: 0.4516
2025-07-09 18:00:42,666 - INFO - Epoch 28: Total optimizer steps: 31
2025-07-09 18:04:01,807 - INFO - Validation metrics:
2025-07-09 18:04:01,808 - INFO -   Loss: 0.1578
2025-07-09 18:04:01,808 - INFO -   Average similarity: 0.6821
2025-07-09 18:04:01,808 - INFO -   Median similarity: 0.9215
2025-07-09 18:04:01,808 - INFO -   Clean sample similarity: 0.6821
2025-07-09 18:04:01,808 - INFO -   Corrupted sample similarity: 0.2303
2025-07-09 18:04:01,808 - INFO -   Similarity gap (clean - corrupt): 0.4518
2025-07-09 18:04:01,924 - INFO - Epoch 28/30 - Train Loss: 0.2068, Val Loss: 0.1578, Clean Sim: 0.6821, Corrupt Sim: 0.2303, Gap: 0.4518, Time: 944.98s
2025-07-09 18:04:01,924 - INFO - New best validation loss: 0.1578
2025-07-09 18:04:08,704 - INFO - New best similarity gap: 0.4518
2025-07-09 18:06:59,355 - INFO - Epoch 28 Validation Alignment: Pos=0.000, Neg=0.000, Gap=0.000
2025-07-09 18:19:22,621 - INFO - Epoch 29: Total optimizer steps: 31
2025-07-09 18:22:40,643 - INFO - Validation metrics:
2025-07-09 18:22:40,644 - INFO -   Loss: 0.1575
2025-07-09 18:22:40,644 - INFO -   Average similarity: 0.6681
2025-07-09 18:22:40,644 - INFO -   Median similarity: 0.9217
2025-07-09 18:22:40,644 - INFO -   Clean sample similarity: 0.6681
2025-07-09 18:22:40,644 - INFO -   Corrupted sample similarity: 0.2448
2025-07-09 18:22:40,644 - INFO -   Similarity gap (clean - corrupt): 0.4233
2025-07-09 18:22:40,759 - INFO - Epoch 29/30 - Train Loss: 0.2089, Val Loss: 0.1575, Clean Sim: 0.6681, Corrupt Sim: 0.2448, Gap: 0.4233, Time: 941.40s
2025-07-09 18:22:40,760 - INFO - New best validation loss: 0.1575
2025-07-09 18:35:06,440 - INFO - Epoch 30: Total optimizer steps: 31
2025-07-09 18:38:24,646 - INFO - Validation metrics:
2025-07-09 18:38:24,647 - INFO -   Loss: 0.1578
2025-07-09 18:38:24,647 - INFO -   Average similarity: 0.7543
2025-07-09 18:38:24,647 - INFO -   Median similarity: 0.9870
2025-07-09 18:38:24,647 - INFO -   Clean sample similarity: 0.7543
2025-07-09 18:38:24,647 - INFO -   Corrupted sample similarity: 0.3034
2025-07-09 18:38:24,647 - INFO -   Similarity gap (clean - corrupt): 0.4509
2025-07-09 18:38:24,749 - INFO - Epoch 30/30 - Train Loss: 0.2037, Val Loss: 0.1578, Clean Sim: 0.7543, Corrupt Sim: 0.3034, Gap: 0.4509, Time: 937.46s
2025-07-09 18:41:15,378 - INFO - Epoch 30 Validation Alignment: Pos=0.000, Neg=0.000, Gap=0.000
2025-07-09 18:41:15,378 - INFO - Training completed!
2025-07-09 18:41:21,296 - INFO - Evaluating best models on test set...
2025-07-09 18:41:25,008 - INFO - Loaded best loss model from epoch 29
2025-07-09 18:44:59,190 - INFO - Test (Best Loss) metrics:
2025-07-09 18:44:59,190 - INFO -   Loss: 0.1639
2025-07-09 18:44:59,190 - INFO -   Average similarity: 0.6754
2025-07-09 18:44:59,190 - INFO -   Median similarity: 0.9328
2025-07-09 18:44:59,190 - INFO -   Clean sample similarity: 0.6754
2025-07-09 18:44:59,190 - INFO -   Corrupted sample similarity: 0.2568
2025-07-09 18:44:59,190 - INFO -   Similarity gap (clean - corrupt): 0.4186
2025-07-09 18:48:07,330 - INFO - Loaded best gap model from epoch 28
2025-07-09 18:51:43,726 - INFO - Test (Best Gap) metrics:
2025-07-09 18:51:43,726 - INFO -   Loss: 0.1586
2025-07-09 18:51:43,726 - INFO -   Average similarity: 0.6904
2025-07-09 18:51:43,726 - INFO -   Median similarity: 0.9348
2025-07-09 18:51:43,726 - INFO -   Clean sample similarity: 0.6904
2025-07-09 18:51:43,726 - INFO -   Corrupted sample similarity: 0.2364
2025-07-09 18:51:43,726 - INFO -   Similarity gap (clean - corrupt): 0.4540
2025-07-09 18:54:45,139 - INFO - Evaluation completed!
2025-07-09 18:54:45,139 - INFO - Test results for best_loss_model:
2025-07-09 18:54:45,139 - INFO -   Loss: 0.1639
2025-07-09 18:54:45,139 - INFO -   Clean Sample Similarity: 0.6754
2025-07-09 18:54:45,139 - INFO -   Corrupted Sample Similarity: 0.2568
2025-07-09 18:54:45,139 - INFO -   Similarity Gap: 0.4186
2025-07-09 18:54:45,139 - INFO - Test results for best_gap_model:
2025-07-09 18:54:45,139 - INFO -   Loss: 0.1586
2025-07-09 18:54:45,139 - INFO -   Clean Sample Similarity: 0.6904
2025-07-09 18:54:45,139 - INFO -   Corrupted Sample Similarity: 0.2364
2025-07-09 18:54:45,139 - INFO -   Similarity Gap: 0.4540
2025-07-09 18:54:45,544 - INFO - All tasks completed!