2025-07-15 22:35:11,935 - INFO - Training with parameters:
2025-07-15 22:35:11,935 - INFO - Text model: sentence-transformers/paraphrase-multilingual-mpnet-base-v2
2025-07-15 22:35:11,935 - INFO - Audio model: facebook/w2v-bert-2.0
2025-07-15 22:35:11,935 - INFO - Freeze encoders: partial
2025-07-15 22:35:11,935 - INFO - Text layers to unfreeze: 3
2025-07-15 22:35:11,935 - INFO - Audio layers to unfreeze: 3
2025-07-15 22:35:11,935 - INFO - Use cross-modal attention: False
2025-07-15 22:35:11,935 - INFO - Use attentive pooling: False
2025-07-15 22:35:11,935 - INFO - Use word-level alignment: True
2025-07-15 22:35:11,935 - INFO - Batch size: 48
2025-07-15 22:35:11,935 - INFO - Gradient accumulation steps: 15
2025-07-15 22:35:11,935 - INFO - Effective batch size: 720
2025-07-15 22:35:11,935 - INFO - Mixed precision training: False
2025-07-15 22:35:11,935 - INFO - Learning rate: 0.0008
2025-07-15 22:35:11,935 - INFO - Temperature: 0.1
2025-07-15 22:35:11,935 - INFO - Projection dimension: 768
2025-07-15 22:35:11,935 - INFO - Training samples: 34898
2025-07-15 22:35:11,935 - INFO - Validation samples: 11252
2025-07-15 22:35:11,935 - INFO - Test samples: 11266
2025-07-15 22:35:11,935 - INFO - Max audio length: 480000 samples (30.00 seconds at 16kHz)
2025-07-15 22:35:11,935 - INFO - Loading tokenizer and feature extractor...
2025-07-15 22:35:15,280 - INFO - Feature extractor output keys: ['input_features', 'attention_mask']
2025-07-15 22:35:15,280 - INFO - Creating datasets...
2025-07-15 22:35:15,280 - INFO - Feature extractor output keys: ['input_features', 'attention_mask']
2025-07-15 22:35:15,280 - INFO - Feature extractor output keys: ['input_features', 'attention_mask']
2025-07-15 22:35:15,280 - INFO - Feature extractor output keys: ['input_features', 'attention_mask']
2025-07-15 22:35:15,280 - INFO - Creating data loaders...
2025-07-15 22:35:15,281 - INFO - Checking a sample batch...
2025-07-15 22:35:26,877 - INFO - input_ids_pos: torch.Size([48, 128])
2025-07-15 22:35:26,877 - INFO - attention_mask_pos: torch.Size([48, 128])
2025-07-15 22:35:26,877 - INFO - input_ids_neg: torch.Size([48, 128])
2025-07-15 22:35:26,877 - INFO - attention_mask_neg: torch.Size([48, 128])
2025-07-15 22:35:26,877 - INFO - input_values: torch.Size([48, 383, 160])
2025-07-15 22:35:26,877 - INFO - attention_mask_audio: torch.Size([48, 383])
2025-07-15 22:35:26,877 - INFO - is_corrupted: torch.Size([48])
2025-07-15 22:35:26,877 - INFO - correctness_scores: torch.Size([48])
2025-07-15 22:35:26,877 - INFO - Initializing model...
2025-07-15 22:35:35,575 - INFO - Text encoder hidden dim: 768
2025-07-15 22:35:35,575 - INFO - Audio encoder hidden dim: 1024
2025-07-15 22:35:35,576 - INFO - Partial freezing: unfreezing last 3 text layers and 3 audio layers
2025-07-15 22:35:35,576 - INFO - Unfreezing text encoder layer 9
2025-07-15 22:35:35,576 - INFO - Unfreezing text encoder layer 10
2025-07-15 22:35:35,576 - INFO - Unfreezing text encoder layer 11
2025-07-15 22:35:35,576 - INFO - Unfreezing audio encoder layer 21
2025-07-15 22:35:35,576 - INFO - Unfreezing audio encoder layer 22
2025-07-15 22:35:35,576 - INFO - Unfreezing audio encoder layer 23
2025-07-15 22:35:35,642 - INFO - Model initialized with 308,221,186 trainable parameters out of 879,798,082 total
2025-07-15 22:35:35,985 - INFO - Using discriminative learning rates: encoder_lr=4e-05, main_lr=0.0008
2025-07-15 22:35:35,985 - INFO - Encoder parameters: 156, Non-encoder parameters: 38
2025-07-15 22:35:35,985 - INFO - Checking if loss parameters are in optimizer...
2025-07-15 22:35:35,985 - INFO - ✓ log_sigma2_align is in optimizer 2025-07-15 22:35:35,985 - INFO - Total parameters in optimizer: 194 2025-07-15 22:35:35,986 - INFO - Model parameters: 193 2025-07-15 22:35:35,986 - INFO - Loss parameters: 1 2025-07-15 22:35:35,986 - INFO - Scheduler setup: 2025-07-15 22:35:35,986 - INFO - Batches per epoch: 727 2025-07-15 22:35:35,986 - INFO - Accumulation steps: 15 2025-07-15 22:35:35,986 - INFO - Optimizer steps per epoch: 49 2025-07-15 22:35:35,986 - INFO - Total optimizer steps: 1470 2025-07-15 22:35:35,986 - INFO - Warmup steps: 1000 2025-07-15 22:35:35,986 - INFO - Validating gradient accumulation setup... 2025-07-15 22:35:35,986 - INFO - Validating gradient accumulation with 15 steps... 2025-07-15 22:35:43,006 - WARNING - Not enough test batches (10) for accumulation_steps (15) 2025-07-15 22:35:43,006 - INFO - Starting training for 30 epochs 2025-07-15 22:36:17,069 - INFO - log_σ² gradient: -0.700972 2025-07-15 22:36:17,150 - INFO - Optimizer step 1: log_σ²=0.000000, weight=1.000000 2025-07-15 22:36:40,805 - INFO - log_σ² gradient: -0.701121 2025-07-15 22:36:40,879 - INFO - Optimizer step 2: log_σ²=0.000001, weight=0.999999 2025-07-15 22:37:04,066 - INFO - log_σ² gradient: -0.700009 2025-07-15 22:37:04,137 - INFO - Optimizer step 3: log_σ²=0.000002, weight=0.999998 2025-07-15 22:37:25,565 - INFO - log_σ² gradient: -0.697532 2025-07-15 22:37:25,643 - INFO - Optimizer step 4: log_σ²=0.000005, weight=0.999995 2025-07-15 22:37:48,183 - INFO - log_σ² gradient: -0.699631 2025-07-15 22:37:48,258 - INFO - Optimizer step 5: log_σ²=0.000008, weight=0.999992 2025-07-15 22:38:11,163 - INFO - log_σ² gradient: -0.693053 2025-07-15 22:38:11,242 - INFO - Optimizer step 6: log_σ²=0.000012, weight=0.999988 2025-07-15 22:38:34,341 - INFO - log_σ² gradient: -0.691750 2025-07-15 22:38:34,411 - INFO - Optimizer step 7: log_σ²=0.000017, weight=0.999983 2025-07-15 22:38:56,481 - INFO - log_σ² gradient: -0.688671 2025-07-15 22:38:56,551 - INFO - 
Optimizer step 8: log_σ²=0.000022, weight=0.999978 2025-07-15 22:39:18,938 - INFO - log_σ² gradient: -0.680052 2025-07-15 22:39:19,016 - INFO - Optimizer step 9: log_σ²=0.000029, weight=0.999971 2025-07-15 22:39:41,060 - INFO - log_σ² gradient: -0.673574 2025-07-15 22:39:41,127 - INFO - Optimizer step 10: log_σ²=0.000036, weight=0.999964 2025-07-15 22:40:03,712 - INFO - log_σ² gradient: -0.670619 2025-07-15 22:40:03,786 - INFO - Optimizer step 11: log_σ²=0.000044, weight=0.999956 2025-07-15 22:40:26,273 - INFO - log_σ² gradient: -0.665540 2025-07-15 22:40:26,343 - INFO - Optimizer step 12: log_σ²=0.000053, weight=0.999947 2025-07-15 22:40:48,968 - INFO - log_σ² gradient: -0.663388 2025-07-15 22:40:49,041 - INFO - Optimizer step 13: log_σ²=0.000062, weight=0.999938 2025-07-15 22:41:10,405 - INFO - log_σ² gradient: -0.654793 2025-07-15 22:41:10,476 - INFO - Optimizer step 14: log_σ²=0.000072, weight=0.999928 2025-07-15 22:41:32,980 - INFO - log_σ² gradient: -0.649332 2025-07-15 22:41:33,051 - INFO - Optimizer step 15: log_σ²=0.000084, weight=0.999916 2025-07-15 22:41:54,178 - INFO - log_σ² gradient: -0.641236 2025-07-15 22:41:54,249 - INFO - Optimizer step 16: log_σ²=0.000095, weight=0.999905 2025-07-15 22:42:17,403 - INFO - log_σ² gradient: -0.633964 2025-07-15 22:42:17,473 - INFO - Optimizer step 17: log_σ²=0.000108, weight=0.999892 2025-07-15 22:42:40,563 - INFO - log_σ² gradient: -0.624257 2025-07-15 22:42:40,633 - INFO - Optimizer step 18: log_σ²=0.000121, weight=0.999879 2025-07-15 22:43:02,784 - INFO - log_σ² gradient: -0.622468 2025-07-15 22:43:02,854 - INFO - Optimizer step 19: log_σ²=0.000135, weight=0.999865 2025-07-15 22:43:26,160 - INFO - log_σ² gradient: -0.607767 2025-07-15 22:43:26,239 - INFO - Optimizer step 20: log_σ²=0.000150, weight=0.999850 2025-07-15 22:43:48,204 - INFO - log_σ² gradient: -0.596726 2025-07-15 22:43:48,274 - INFO - Optimizer step 21: log_σ²=0.000166, weight=0.999834 2025-07-15 22:44:09,668 - INFO - log_σ² gradient: -0.596824 
2025-07-15 22:44:09,740 - INFO - Optimizer step 22: log_σ²=0.000182, weight=0.999818 2025-07-15 22:44:31,637 - INFO - log_σ² gradient: -0.581088 2025-07-15 22:44:31,707 - INFO - Optimizer step 23: log_σ²=0.000199, weight=0.999801 2025-07-15 22:44:51,990 - INFO - log_σ² gradient: -0.571389 2025-07-15 22:44:52,063 - INFO - Optimizer step 24: log_σ²=0.000216, weight=0.999784 2025-07-15 22:45:13,033 - INFO - log_σ² gradient: -0.565279 2025-07-15 22:45:13,106 - INFO - Optimizer step 25: log_σ²=0.000235, weight=0.999765 2025-07-15 22:45:35,878 - INFO - log_σ² gradient: -0.570360 2025-07-15 22:45:35,952 - INFO - Optimizer step 26: log_σ²=0.000254, weight=0.999746 2025-07-15 22:45:58,970 - INFO - log_σ² gradient: -0.556078 2025-07-15 22:45:59,052 - INFO - Optimizer step 27: log_σ²=0.000273, weight=0.999727 2025-07-15 22:46:21,576 - INFO - log_σ² gradient: -0.563557 2025-07-15 22:46:21,645 - INFO - Optimizer step 28: log_σ²=0.000293, weight=0.999707 2025-07-15 22:46:42,294 - INFO - log_σ² gradient: -0.564101 2025-07-15 22:46:42,360 - INFO - Optimizer step 29: log_σ²=0.000314, weight=0.999686 2025-07-15 22:47:03,167 - INFO - log_σ² gradient: -0.551421 2025-07-15 22:47:03,240 - INFO - Optimizer step 30: log_σ²=0.000336, weight=0.999664 2025-07-15 22:47:25,632 - INFO - log_σ² gradient: -0.581287 2025-07-15 22:47:25,704 - INFO - Optimizer step 31: log_σ²=0.000358, weight=0.999642 2025-07-15 22:47:46,914 - INFO - log_σ² gradient: -0.571904 2025-07-15 22:47:46,983 - INFO - Optimizer step 32: log_σ²=0.000382, weight=0.999618 2025-07-15 22:48:09,905 - INFO - log_σ² gradient: -0.576799 2025-07-15 22:48:09,983 - INFO - Optimizer step 33: log_σ²=0.000405, weight=0.999595 2025-07-15 22:48:33,089 - INFO - log_σ² gradient: -0.565765 2025-07-15 22:48:33,167 - INFO - Optimizer step 34: log_σ²=0.000430, weight=0.999570 2025-07-15 22:48:55,638 - INFO - log_σ² gradient: -0.559855 2025-07-15 22:48:55,716 - INFO - Optimizer step 35: log_σ²=0.000455, weight=0.999545 2025-07-15 22:49:18,204 - 
INFO - log_σ² gradient: -0.568272 2025-07-15 22:49:18,277 - INFO - Optimizer step 36: log_σ²=0.000481, weight=0.999519 2025-07-15 22:49:41,263 - INFO - log_σ² gradient: -0.561703 2025-07-15 22:49:41,337 - INFO - Optimizer step 37: log_σ²=0.000508, weight=0.999492 2025-07-15 22:50:01,940 - INFO - log_σ² gradient: -0.561730 2025-07-15 22:50:02,014 - INFO - Optimizer step 38: log_σ²=0.000536, weight=0.999465 2025-07-15 22:50:24,296 - INFO - log_σ² gradient: -0.549345 2025-07-15 22:50:24,368 - INFO - Optimizer step 39: log_σ²=0.000564, weight=0.999436 2025-07-15 22:50:47,876 - INFO - log_σ² gradient: -0.557178 2025-07-15 22:50:47,948 - INFO - Optimizer step 40: log_σ²=0.000593, weight=0.999408 2025-07-15 22:51:10,085 - INFO - log_σ² gradient: -0.563359 2025-07-15 22:51:10,158 - INFO - Optimizer step 41: log_σ²=0.000622, weight=0.999378 2025-07-15 22:51:31,799 - INFO - log_σ² gradient: -0.560122 2025-07-15 22:51:31,871 - INFO - Optimizer step 42: log_σ²=0.000653, weight=0.999348 2025-07-15 22:51:52,251 - INFO - log_σ² gradient: -0.558601 2025-07-15 22:51:52,321 - INFO - Optimizer step 43: log_σ²=0.000684, weight=0.999317 2025-07-15 22:52:13,955 - INFO - log_σ² gradient: -0.550537 2025-07-15 22:52:14,027 - INFO - Optimizer step 44: log_σ²=0.000715, weight=0.999285 2025-07-15 22:52:36,446 - INFO - log_σ² gradient: -0.560463 2025-07-15 22:52:36,518 - INFO - Optimizer step 45: log_σ²=0.000748, weight=0.999252 2025-07-15 22:52:58,608 - INFO - log_σ² gradient: -0.555325 2025-07-15 22:52:58,677 - INFO - Optimizer step 46: log_σ²=0.000781, weight=0.999219 2025-07-15 22:53:21,558 - INFO - log_σ² gradient: -0.551064 2025-07-15 22:53:21,629 - INFO - Optimizer step 47: log_σ²=0.000815, weight=0.999185 2025-07-15 22:53:42,689 - INFO - log_σ² gradient: -0.548375 2025-07-15 22:53:42,759 - INFO - Optimizer step 48: log_σ²=0.000850, weight=0.999150 2025-07-15 22:53:54,544 - INFO - log_σ² gradient: -0.249199 2025-07-15 22:53:54,621 - INFO - Optimizer step 49: log_σ²=0.000884, 
weight=0.999117 2025-07-15 22:53:54,929 - INFO - Epoch 1: Total optimizer steps: 49 2025-07-15 22:56:54,160 - INFO - Validation metrics: 2025-07-15 22:56:54,160 - INFO - Loss: 0.9396 2025-07-15 22:56:54,160 - INFO - BCE Loss: 0.5631 2025-07-15 22:56:54,160 - INFO - Weighted BCE Loss: 0.5626 2025-07-15 22:56:54,160 - INFO - Average similarity: 0.4324 2025-07-15 22:56:54,160 - INFO - Median similarity: 0.4372 2025-07-15 22:56:54,160 - INFO - Clean sample similarity: 0.4324 2025-07-15 22:56:54,160 - INFO - Corrupted sample similarity: 0.3555 2025-07-15 22:56:54,160 - INFO - Similarity gap (clean - corrupt): 0.0768 2025-07-15 22:56:54,345 - INFO - Epoch 1/30 - Train Loss: 1.1750, Val Loss: 0.9396, Val BCE: 0.5631, Val wBCE: 0.5626, Clean Sim: 0.4324, Corrupt Sim: 0.3555, Gap: 0.0768, Time: 1271.34s 2025-07-15 22:56:54,346 - INFO - New best validation loss: 0.9396 2025-07-15 22:56:57,621 - INFO - New best similarity gap: 0.0768 2025-07-15 22:57:33,235 - INFO - log_σ² gradient: -0.562548 2025-07-15 22:57:33,306 - INFO - Optimizer step 1: log_σ²=0.000919, weight=0.999082 2025-07-15 22:57:55,870 - INFO - log_σ² gradient: -0.560402 2025-07-15 22:57:55,945 - INFO - Optimizer step 2: log_σ²=0.000954, weight=0.999046 2025-07-15 22:58:19,210 - INFO - log_σ² gradient: -0.565008 2025-07-15 22:58:19,285 - INFO - Optimizer step 3: log_σ²=0.000991, weight=0.999010 2025-07-15 22:58:42,331 - INFO - log_σ² gradient: -0.560273 2025-07-15 22:58:42,401 - INFO - Optimizer step 4: log_σ²=0.001028, weight=0.998972 2025-07-15 22:59:05,717 - INFO - log_σ² gradient: -0.562259 2025-07-15 22:59:05,789 - INFO - Optimizer step 5: log_σ²=0.001067, weight=0.998934 2025-07-15 22:59:27,106 - INFO - log_σ² gradient: -0.559529 2025-07-15 22:59:27,176 - INFO - Optimizer step 6: log_σ²=0.001106, weight=0.998895 2025-07-15 22:59:49,931 - INFO - log_σ² gradient: -0.547505 2025-07-15 22:59:50,006 - INFO - Optimizer step 7: log_σ²=0.001146, weight=0.998855 2025-07-15 23:00:11,415 - INFO - log_σ² gradient: 
-0.549267 2025-07-15 23:00:11,496 - INFO - Optimizer step 8: log_σ²=0.001187, weight=0.998814 2025-07-15 23:00:34,107 - INFO - log_σ² gradient: -0.552445 2025-07-15 23:00:34,179 - INFO - Optimizer step 9: log_σ²=0.001229, weight=0.998772 2025-07-15 23:00:56,977 - INFO - log_σ² gradient: -0.552005 2025-07-15 23:00:57,050 - INFO - Optimizer step 10: log_σ²=0.001271, weight=0.998729 2025-07-15 23:01:19,193 - INFO - log_σ² gradient: -0.554567 2025-07-15 23:01:19,259 - INFO - Optimizer step 11: log_σ²=0.001315, weight=0.998686 2025-07-15 23:01:41,280 - INFO - log_σ² gradient: -0.557522 2025-07-15 23:01:41,358 - INFO - Optimizer step 12: log_σ²=0.001359, weight=0.998642 2025-07-15 23:02:03,803 - INFO - log_σ² gradient: -0.555053 2025-07-15 23:02:03,874 - INFO - Optimizer step 13: log_σ²=0.001404, weight=0.998597 2025-07-15 23:02:26,088 - INFO - log_σ² gradient: -0.551558 2025-07-15 23:02:26,163 - INFO - Optimizer step 14: log_σ²=0.001450, weight=0.998551 2025-07-15 23:02:47,931 - INFO - log_σ² gradient: -0.563379 2025-07-15 23:02:48,002 - INFO - Optimizer step 15: log_σ²=0.001497, weight=0.998504 2025-07-15 23:03:08,719 - INFO - log_σ² gradient: -0.555350 2025-07-15 23:03:08,792 - INFO - Optimizer step 16: log_σ²=0.001545, weight=0.998457 2025-07-15 23:03:30,215 - INFO - log_σ² gradient: -0.553956 2025-07-15 23:03:30,288 - INFO - Optimizer step 17: log_σ²=0.001593, weight=0.998408 2025-07-15 23:03:52,222 - INFO - log_σ² gradient: -0.555845 2025-07-15 23:03:52,292 - INFO - Optimizer step 18: log_σ²=0.001642, weight=0.998359 2025-07-15 23:04:14,642 - INFO - log_σ² gradient: -0.556693 2025-07-15 23:04:14,720 - INFO - Optimizer step 19: log_σ²=0.001692, weight=0.998309 2025-07-15 23:04:35,790 - INFO - log_σ² gradient: -0.559947 2025-07-15 23:04:35,867 - INFO - Optimizer step 20: log_σ²=0.001743, weight=0.998258 2025-07-15 23:04:59,469 - INFO - log_σ² gradient: -0.560336 2025-07-15 23:04:59,547 - INFO - Optimizer step 21: log_σ²=0.001795, weight=0.998206 2025-07-15 
23:05:22,004 - INFO - log_σ² gradient: -0.548883 2025-07-15 23:05:22,077 - INFO - Optimizer step 22: log_σ²=0.001848, weight=0.998154 2025-07-15 23:05:46,542 - INFO - log_σ² gradient: -0.541997 2025-07-15 23:05:46,613 - INFO - Optimizer step 23: log_σ²=0.001901, weight=0.998101 2025-07-15 23:06:07,874 - INFO - log_σ² gradient: -0.560939 2025-07-15 23:06:07,947 - INFO - Optimizer step 24: log_σ²=0.001955, weight=0.998047 2025-07-15 23:06:30,588 - INFO - log_σ² gradient: -0.554256 2025-07-15 23:06:30,657 - INFO - Optimizer step 25: log_σ²=0.002010, weight=0.997992 2025-07-15 23:06:53,862 - INFO - log_σ² gradient: -0.550744 2025-07-15 23:06:53,936 - INFO - Optimizer step 26: log_σ²=0.002066, weight=0.997936 2025-07-15 23:07:16,417 - INFO - log_σ² gradient: -0.552476 2025-07-15 23:07:16,485 - INFO - Optimizer step 27: log_σ²=0.002122, weight=0.997880 2025-07-15 23:07:36,966 - INFO - log_σ² gradient: -0.550468 2025-07-15 23:07:37,039 - INFO - Optimizer step 28: log_σ²=0.002180, weight=0.997823 2025-07-15 23:07:58,173 - INFO - log_σ² gradient: -0.551074 2025-07-15 23:07:58,245 - INFO - Optimizer step 29: log_σ²=0.002238, weight=0.997765 2025-07-15 23:08:20,305 - INFO - log_σ² gradient: -0.546845 2025-07-15 23:08:20,375 - INFO - Optimizer step 30: log_σ²=0.002297, weight=0.997706 2025-07-15 23:08:42,396 - INFO - log_σ² gradient: -0.544335 2025-07-15 23:08:42,466 - INFO - Optimizer step 31: log_σ²=0.002356, weight=0.997647 2025-07-15 23:09:04,741 - INFO - log_σ² gradient: -0.544547 2025-07-15 23:09:04,815 - INFO - Optimizer step 32: log_σ²=0.002416, weight=0.997586 2025-07-15 23:09:28,318 - INFO - log_σ² gradient: -0.547305 2025-07-15 23:09:28,391 - INFO - Optimizer step 33: log_σ²=0.002477, weight=0.997526 2025-07-15 23:09:48,727 - INFO - log_σ² gradient: -0.546751 2025-07-15 23:09:48,799 - INFO - Optimizer step 34: log_σ²=0.002539, weight=0.997464 2025-07-15 23:10:09,451 - INFO - log_σ² gradient: -0.552852 2025-07-15 23:10:09,524 - INFO - Optimizer step 35: 
log_σ²=0.002602, weight=0.997401 2025-07-15 23:10:31,720 - INFO - log_σ² gradient: -0.559388 2025-07-15 23:10:31,793 - INFO - Optimizer step 36: log_σ²=0.002665, weight=0.997338 2025-07-15 23:10:53,357 - INFO - log_σ² gradient: -0.543661 2025-07-15 23:10:53,430 - INFO - Optimizer step 37: log_σ²=0.002730, weight=0.997274 2025-07-15 23:11:16,093 - INFO - log_σ² gradient: -0.550387 2025-07-15 23:11:16,170 - INFO - Optimizer step 38: log_σ²=0.002795, weight=0.997209 2025-07-15 23:11:39,591 - INFO - log_σ² gradient: -0.544392 2025-07-15 23:11:39,663 - INFO - Optimizer step 39: log_σ²=0.002860, weight=0.997144 2025-07-15 23:12:02,317 - INFO - log_σ² gradient: -0.559151 2025-07-15 23:12:02,388 - INFO - Optimizer step 40: log_σ²=0.002927, weight=0.997077 2025-07-15 23:12:24,582 - INFO - log_σ² gradient: -0.542131 2025-07-15 23:12:24,652 - INFO - Optimizer step 41: log_σ²=0.002994, weight=0.997010 2025-07-15 23:12:46,948 - INFO - log_σ² gradient: -0.544075 2025-07-15 23:12:47,018 - INFO - Optimizer step 42: log_σ²=0.003063, weight=0.996942 2025-07-15 23:13:10,016 - INFO - log_σ² gradient: -0.554919 2025-07-15 23:13:10,092 - INFO - Optimizer step 43: log_σ²=0.003132, weight=0.996873 2025-07-15 23:13:32,361 - INFO - log_σ² gradient: -0.548503 2025-07-15 23:13:32,431 - INFO - Optimizer step 44: log_σ²=0.003201, weight=0.996804 2025-07-15 23:13:53,918 - INFO - log_σ² gradient: -0.549026 2025-07-15 23:13:53,987 - INFO - Optimizer step 45: log_σ²=0.003272, weight=0.996733 2025-07-15 23:14:15,816 - INFO - log_σ² gradient: -0.547647 2025-07-15 23:14:15,888 - INFO - Optimizer step 46: log_σ²=0.003343, weight=0.996662 2025-07-15 23:14:38,355 - INFO - log_σ² gradient: -0.554392 2025-07-15 23:14:38,423 - INFO - Optimizer step 47: log_σ²=0.003415, weight=0.996590 2025-07-15 23:15:00,686 - INFO - log_σ² gradient: -0.546648 2025-07-15 23:15:00,759 - INFO - Optimizer step 48: log_σ²=0.003488, weight=0.996518 2025-07-15 23:15:10,014 - INFO - log_σ² gradient: -0.252545 2025-07-15 
23:15:10,088 - INFO - Optimizer step 49: log_σ²=0.003558, weight=0.996448 2025-07-15 23:15:10,331 - INFO - Epoch 2: Total optimizer steps: 49 2025-07-15 23:18:09,864 - INFO - Validation metrics: 2025-07-15 23:18:09,864 - INFO - Loss: 0.8383 2025-07-15 23:18:09,864 - INFO - BCE Loss: 0.5520 2025-07-15 23:18:09,864 - INFO - Weighted BCE Loss: 0.5500 2025-07-15 23:18:09,864 - INFO - Average similarity: 0.5789 2025-07-15 23:18:09,864 - INFO - Median similarity: 0.6112 2025-07-15 23:18:09,864 - INFO - Clean sample similarity: 0.5789 2025-07-15 23:18:09,864 - INFO - Corrupted sample similarity: 0.4373 2025-07-15 23:18:09,864 - INFO - Similarity gap (clean - corrupt): 0.1416 2025-07-15 23:18:10,091 - INFO - Epoch 2/30 - Train Loss: 0.9187, Val Loss: 0.8383, Val BCE: 0.5520, Val wBCE: 0.5500, Clean Sim: 0.5789, Corrupt Sim: 0.4373, Gap: 0.1416, Time: 1269.35s 2025-07-15 23:18:10,091 - INFO - New best validation loss: 0.8383 2025-07-15 23:18:13,334 - INFO - New best similarity gap: 0.1416 2025-07-15 23:21:03,580 - INFO - Epoch 2 Validation Alignment: Pos=0.143, Neg=0.120, Gap=0.023 2025-07-15 23:21:36,664 - INFO - log_σ² gradient: -0.551619 2025-07-15 23:21:36,734 - INFO - Optimizer step 1: log_σ²=0.003629, weight=0.996377 2025-07-15 23:21:58,139 - INFO - log_σ² gradient: -0.541849 2025-07-15 23:21:58,212 - INFO - Optimizer step 2: log_σ²=0.003702, weight=0.996305 2025-07-15 23:22:19,812 - INFO - log_σ² gradient: -0.552321 2025-07-15 23:22:19,882 - INFO - Optimizer step 3: log_σ²=0.003775, weight=0.996232 2025-07-15 23:22:42,576 - INFO - log_σ² gradient: -0.541674 2025-07-15 23:22:42,645 - INFO - Optimizer step 4: log_σ²=0.003849, weight=0.996158 2025-07-15 23:23:04,284 - INFO - log_σ² gradient: -0.548615 2025-07-15 23:23:04,359 - INFO - Optimizer step 5: log_σ²=0.003925, weight=0.996083 2025-07-15 23:23:26,181 - INFO - log_σ² gradient: -0.545819 2025-07-15 23:23:26,252 - INFO - Optimizer step 6: log_σ²=0.004001, weight=0.996007 2025-07-15 23:23:48,709 - INFO - log_σ² 
gradient: -0.538285 2025-07-15 23:23:48,784 - INFO - Optimizer step 7: log_σ²=0.004078, weight=0.995930 2025-07-15 23:24:12,300 - INFO - log_σ² gradient: -0.554427 2025-07-15 23:24:12,382 - INFO - Optimizer step 8: log_σ²=0.004157, weight=0.995852 2025-07-15 23:24:34,626 - INFO - log_σ² gradient: -0.552556 2025-07-15 23:24:34,703 - INFO - Optimizer step 9: log_σ²=0.004236, weight=0.995773 2025-07-15 23:24:57,169 - INFO - log_σ² gradient: -0.560449 2025-07-15 23:24:57,243 - INFO - Optimizer step 10: log_σ²=0.004317, weight=0.995693 2025-07-15 23:25:19,372 - INFO - log_σ² gradient: -0.551200 2025-07-15 23:25:19,446 - INFO - Optimizer step 11: log_σ²=0.004398, weight=0.995612 2025-07-15 23:25:41,766 - INFO - log_σ² gradient: -0.536372 2025-07-15 23:25:41,844 - INFO - Optimizer step 12: log_σ²=0.004480, weight=0.995530 2025-07-15 23:26:05,708 - INFO - log_σ² gradient: -0.546741 2025-07-15 23:26:05,781 - INFO - Optimizer step 13: log_σ²=0.004563, weight=0.995447 2025-07-15 23:26:28,288 - INFO - log_σ² gradient: -0.552697 2025-07-15 23:26:28,362 - INFO - Optimizer step 14: log_σ²=0.004647, weight=0.995363 2025-07-15 23:26:51,601 - INFO - log_σ² gradient: -0.542606 2025-07-15 23:26:51,671 - INFO - Optimizer step 15: log_σ²=0.004732, weight=0.995279 2025-07-15 23:27:14,815 - INFO - log_σ² gradient: -0.548309 2025-07-15 23:27:14,893 - INFO - Optimizer step 16: log_σ²=0.004818, weight=0.995194 2025-07-15 23:27:38,033 - INFO - log_σ² gradient: -0.550456 2025-07-15 23:27:38,101 - INFO - Optimizer step 17: log_σ²=0.004905, weight=0.995107 2025-07-15 23:28:01,108 - INFO - log_σ² gradient: -0.546478 2025-07-15 23:28:01,179 - INFO - Optimizer step 18: log_σ²=0.004992, weight=0.995020 2025-07-15 23:28:23,829 - INFO - log_σ² gradient: -0.545807 2025-07-15 23:28:23,907 - INFO - Optimizer step 19: log_σ²=0.005080, weight=0.994933 2025-07-15 23:28:46,749 - INFO - log_σ² gradient: -0.542592 2025-07-15 23:28:46,823 - INFO - Optimizer step 20: log_σ²=0.005169, weight=0.994844 2025-07-15 
23:29:08,442 - INFO - log_σ² gradient: -0.552501 2025-07-15 23:29:08,512 - INFO - Optimizer step 21: log_σ²=0.005259, weight=0.994754 2025-07-15 23:29:31,838 - INFO - log_σ² gradient: -0.541042 2025-07-15 23:29:31,909 - INFO - Optimizer step 22: log_σ²=0.005350, weight=0.994664 2025-07-15 23:29:53,805 - INFO - log_σ² gradient: -0.539292 2025-07-15 23:29:53,878 - INFO - Optimizer step 23: log_σ²=0.005442, weight=0.994573 2025-07-15 23:30:16,129 - INFO - log_σ² gradient: -0.549630 2025-07-15 23:30:16,203 - INFO - Optimizer step 24: log_σ²=0.005534, weight=0.994481 2025-07-15 23:30:37,827 - INFO - log_σ² gradient: -0.536380 2025-07-15 23:30:37,897 - INFO - Optimizer step 25: log_σ²=0.005627, weight=0.994389 2025-07-15 23:31:00,610 - INFO - log_σ² gradient: -0.539152 2025-07-15 23:31:00,688 - INFO - Optimizer step 26: log_σ²=0.005721, weight=0.994295 2025-07-15 23:31:23,490 - INFO - log_σ² gradient: -0.538578 2025-07-15 23:31:23,561 - INFO - Optimizer step 27: log_σ²=0.005815, weight=0.994201 2025-07-15 23:31:45,290 - INFO - log_σ² gradient: -0.546386 2025-07-15 23:31:45,362 - INFO - Optimizer step 28: log_σ²=0.005911, weight=0.994107 2025-07-15 23:32:07,038 - INFO - log_σ² gradient: -0.542822 2025-07-15 23:32:07,109 - INFO - Optimizer step 29: log_σ²=0.006007, weight=0.994011 2025-07-15 23:32:29,341 - INFO - log_σ² gradient: -0.548613 2025-07-15 23:32:29,417 - INFO - Optimizer step 30: log_σ²=0.006104, weight=0.993915 2025-07-15 23:32:52,607 - INFO - log_σ² gradient: -0.544888 2025-07-15 23:32:52,680 - INFO - Optimizer step 31: log_σ²=0.006202, weight=0.993817 2025-07-15 23:33:15,111 - INFO - log_σ² gradient: -0.531395 2025-07-15 23:33:15,192 - INFO - Optimizer step 32: log_σ²=0.006300, weight=0.993719 2025-07-16 08:20:09,560 - INFO - Training with parameters: 2025-07-16 08:20:09,560 - INFO - Text model: sentence-transformers/paraphrase-multilingual-mpnet-base-v2 2025-07-16 08:20:09,560 - INFO - Audio model: facebook/w2v-bert-2.0 2025-07-16 08:20:09,560 - INFO - 
Freeze encoders: partial
2025-07-16 08:20:09,560 - INFO - Text layers to unfreeze: 3
2025-07-16 08:20:09,560 - INFO - Audio layers to unfreeze: 3
2025-07-16 08:20:09,560 - INFO - Use cross-modal attention: False
2025-07-16 08:20:09,560 - INFO - Use attentive pooling: False
2025-07-16 08:20:09,560 - INFO - Use word-level alignment: True
2025-07-16 08:20:09,560 - INFO - Batch size: 48
2025-07-16 08:20:09,560 - INFO - Gradient accumulation steps: 15
2025-07-16 08:20:09,560 - INFO - Effective batch size: 720
2025-07-16 08:20:09,560 - INFO - Mixed precision training: False
2025-07-16 08:20:09,560 - INFO - Learning rate: 0.0008
2025-07-16 08:20:09,560 - INFO - Temperature: 0.1
2025-07-16 08:20:09,560 - INFO - Projection dimension: 768
2025-07-16 08:20:09,560 - INFO - Training samples: 34898
2025-07-16 08:20:09,560 - INFO - Validation samples: 11252
2025-07-16 08:20:09,560 - INFO - Test samples: 11266
2025-07-16 08:20:09,560 - INFO - Max audio length: 480000 samples (30.00 seconds at 16kHz)
2025-07-16 08:20:09,560 - INFO - Loading tokenizer and feature extractor...
2025-07-16 08:20:10,654 - INFO - Feature extractor output keys: ['input_features', 'attention_mask']
2025-07-16 08:20:10,655 - INFO - Creating datasets...
2025-07-16 08:20:10,655 - INFO - Feature extractor output keys: ['input_features', 'attention_mask']
2025-07-16 08:20:10,655 - INFO - Feature extractor output keys: ['input_features', 'attention_mask']
2025-07-16 08:20:10,655 - INFO - Feature extractor output keys: ['input_features', 'attention_mask']
2025-07-16 08:20:10,655 - INFO - Creating data loaders...
2025-07-16 08:20:10,656 - INFO - Checking a sample batch...
2025-07-16 08:20:16,538 - INFO - input_ids_pos: torch.Size([48, 128])
2025-07-16 08:20:16,538 - INFO - attention_mask_pos: torch.Size([48, 128])
2025-07-16 08:20:16,538 - INFO - input_ids_neg: torch.Size([48, 128])
2025-07-16 08:20:16,538 - INFO - attention_mask_neg: torch.Size([48, 128])
2025-07-16 08:20:16,538 - INFO - input_values: torch.Size([48, 383, 160])
2025-07-16 08:20:16,538 - INFO - attention_mask_audio: torch.Size([48, 383])
2025-07-16 08:20:16,538 - INFO - is_corrupted: torch.Size([48])
2025-07-16 08:20:16,538 - INFO - correctness_scores: torch.Size([48])
2025-07-16 08:20:16,538 - INFO - Initializing model...
2025-07-16 08:20:17,577 - INFO - Text encoder hidden dim: 768
2025-07-16 08:20:17,577 - INFO - Audio encoder hidden dim: 1024
2025-07-16 08:20:17,577 - INFO - Partial freezing: unfreezing last 3 text layers and 3 audio layers
2025-07-16 08:20:17,577 - INFO - Unfreezing text encoder layer 9
2025-07-16 08:20:17,577 - INFO - Unfreezing text encoder layer 10
2025-07-16 08:20:17,577 - INFO - Unfreezing text encoder layer 11
2025-07-16 08:20:17,578 - INFO - Unfreezing audio encoder layer 21
2025-07-16 08:20:17,578 - INFO - Unfreezing audio encoder layer 22
2025-07-16 08:20:17,578 - INFO - Unfreezing audio encoder layer 23
2025-07-16 08:20:17,648 - INFO - Model initialized with 308,221,186 trainable parameters out of 879,798,082 total
2025-07-16 08:20:18,229 - INFO - Using discriminative learning rates: encoder_lr=4e-05, main_lr=0.0008
2025-07-16 08:20:18,229 - INFO - Encoder parameters: 156, Non-encoder parameters: 38
2025-07-16 08:20:18,229 - INFO - Checking if loss parameters are in optimizer...
2025-07-16 08:20:18,229 - INFO - ✓ log_sigma2_align is in optimizer 2025-07-16 08:20:18,229 - INFO - Total parameters in optimizer: 194 2025-07-16 08:20:18,230 - INFO - Model parameters: 193 2025-07-16 08:20:18,230 - INFO - Loss parameters: 1 2025-07-16 08:20:18,230 - INFO - Scheduler setup: 2025-07-16 08:20:18,230 - INFO - Batches per epoch: 727 2025-07-16 08:20:18,230 - INFO - Accumulation steps: 15 2025-07-16 08:20:18,230 - INFO - Optimizer steps per epoch: 49 2025-07-16 08:20:18,230 - INFO - Total optimizer steps: 1470 2025-07-16 08:20:18,230 - INFO - Warmup steps: 1000 2025-07-16 08:20:18,230 - INFO - Validating gradient accumulation setup... 2025-07-16 08:20:18,231 - INFO - Validating gradient accumulation with 15 steps... 2025-07-16 08:20:23,944 - WARNING - Not enough test batches (10) for accumulation_steps (15) 2025-07-16 08:20:23,944 - INFO - Starting training for 30 epochs 2025-07-16 08:20:57,096 - INFO - log_σ² gradient: -0.700972 2025-07-16 08:20:57,261 - INFO - Optimizer step 1: log_σ²=0.000000, weight=1.000000 2025-07-16 08:21:20,017 - INFO - log_σ² gradient: -0.701121 2025-07-16 08:21:20,091 - INFO - Optimizer step 2: log_σ²=0.000001, weight=0.999999 2025-07-16 08:21:42,728 - INFO - log_σ² gradient: -0.700009 2025-07-16 08:21:42,799 - INFO - Optimizer step 3: log_σ²=0.000002, weight=0.999998 2025-07-16 08:22:03,970 - INFO - log_σ² gradient: -0.697532 2025-07-16 08:22:04,048 - INFO - Optimizer step 4: log_σ²=0.000005, weight=0.999995 2025-07-16 08:22:26,157 - INFO - log_σ² gradient: -0.699631 2025-07-16 08:22:26,232 - INFO - Optimizer step 5: log_σ²=0.000008, weight=0.999992 2025-07-16 08:22:49,058 - INFO - log_σ² gradient: -0.693053 2025-07-16 08:22:49,137 - INFO - Optimizer step 6: log_σ²=0.000012, weight=0.999988 2025-07-16 08:23:11,071 - INFO - log_σ² gradient: -0.691750 2025-07-16 08:23:11,141 - INFO - Optimizer step 7: log_σ²=0.000017, weight=0.999983 2025-07-16 08:23:32,916 - INFO - log_σ² gradient: -0.688671 2025-07-16 08:23:32,985 - INFO - 
Optimizer step 8: log_σ²=0.000022, weight=0.999978 2025-07-16 08:23:55,358 - INFO - log_σ² gradient: -0.680052 2025-07-16 08:23:55,436 - INFO - Optimizer step 9: log_σ²=0.000029, weight=0.999971 2025-07-16 08:24:16,822 - INFO - log_σ² gradient: -0.673574 2025-07-16 08:24:16,888 - INFO - Optimizer step 10: log_σ²=0.000036, weight=0.999964 2025-07-16 08:24:38,929 - INFO - log_σ² gradient: -0.670619 2025-07-16 08:24:39,003 - INFO - Optimizer step 11: log_σ²=0.000044, weight=0.999956 2025-07-16 08:25:00,974 - INFO - log_σ² gradient: -0.665540 2025-07-16 08:25:01,044 - INFO - Optimizer step 12: log_σ²=0.000053, weight=0.999947 2025-07-16 08:25:23,732 - INFO - log_σ² gradient: -0.663388 2025-07-16 08:25:23,806 - INFO - Optimizer step 13: log_σ²=0.000062, weight=0.999938 2025-07-16 08:25:44,775 - INFO - log_σ² gradient: -0.654793 2025-07-16 08:25:44,846 - INFO - Optimizer step 14: log_σ²=0.000072, weight=0.999928 2025-07-16 08:26:06,925 - INFO - log_σ² gradient: -0.649332 2025-07-16 08:26:06,996 - INFO - Optimizer step 15: log_σ²=0.000084, weight=0.999916 2025-07-16 08:26:28,181 - INFO - log_σ² gradient: -0.641236 2025-07-16 08:26:28,252 - INFO - Optimizer step 16: log_σ²=0.000095, weight=0.999905 2025-07-16 08:26:51,215 - INFO - log_σ² gradient: -0.633964 2025-07-16 08:26:51,286 - INFO - Optimizer step 17: log_σ²=0.000108, weight=0.999892 2025-07-16 08:27:13,623 - INFO - log_σ² gradient: -0.624257 2025-07-16 08:27:13,694 - INFO - Optimizer step 18: log_σ²=0.000121, weight=0.999879 2025-07-16 08:27:35,197 - INFO - log_σ² gradient: -0.622468 2025-07-16 08:27:35,267 - INFO - Optimizer step 19: log_σ²=0.000135, weight=0.999865 2025-07-16 08:27:58,068 - INFO - log_σ² gradient: -0.607767 2025-07-16 08:27:58,147 - INFO - Optimizer step 20: log_σ²=0.000150, weight=0.999850 2025-07-16 08:28:19,645 - INFO - log_σ² gradient: -0.596726 2025-07-16 08:28:19,715 - INFO - Optimizer step 21: log_σ²=0.000166, weight=0.999834 2025-07-16 08:28:40,911 - INFO - log_σ² gradient: -0.596824 
2025-07-16 08:28:40,983 - INFO - Optimizer step 22: log_σ²=0.000182, weight=0.999818 2025-07-16 08:29:02,357 - INFO - log_σ² gradient: -0.581088 2025-07-16 08:29:02,428 - INFO - Optimizer step 23: log_σ²=0.000199, weight=0.999801 2025-07-16 08:29:22,437 - INFO - log_σ² gradient: -0.571389 2025-07-16 08:29:22,511 - INFO - Optimizer step 24: log_σ²=0.000216, weight=0.999784 2025-07-16 08:29:43,021 - INFO - log_σ² gradient: -0.565279 2025-07-16 08:29:43,094 - INFO - Optimizer step 25: log_σ²=0.000235, weight=0.999765 2025-07-16 08:30:05,299 - INFO - log_σ² gradient: -0.570360 2025-07-16 08:30:05,373 - INFO - Optimizer step 26: log_σ²=0.000254, weight=0.999746 2025-07-16 08:30:27,996 - INFO - log_σ² gradient: -0.556078 2025-07-16 08:30:28,078 - INFO - Optimizer step 27: log_σ²=0.000273, weight=0.999727 2025-07-16 08:30:49,958 - INFO - log_σ² gradient: -0.563557 2025-07-16 08:30:50,027 - INFO - Optimizer step 28: log_σ²=0.000293, weight=0.999707 2025-07-16 08:31:10,522 - INFO - log_σ² gradient: -0.564101 2025-07-16 08:31:10,588 - INFO - Optimizer step 29: log_σ²=0.000314, weight=0.999686 2025-07-16 08:31:30,990 - INFO - log_σ² gradient: -0.551421 2025-07-16 08:31:31,065 - INFO - Optimizer step 30: log_σ²=0.000336, weight=0.999664 2025-07-16 08:31:53,295 - INFO - log_σ² gradient: -0.581287 2025-07-16 08:31:53,367 - INFO - Optimizer step 31: log_σ²=0.000358, weight=0.999642 2025-07-16 08:32:14,164 - INFO - log_σ² gradient: -0.571904 2025-07-16 08:32:14,234 - INFO - Optimizer step 32: log_σ²=0.000382, weight=0.999618 2025-07-16 08:32:36,727 - INFO - log_σ² gradient: -0.576799 2025-07-16 08:32:36,806 - INFO - Optimizer step 33: log_σ²=0.000405, weight=0.999595 2025-07-16 08:32:59,750 - INFO - log_σ² gradient: -0.565765 2025-07-16 08:32:59,828 - INFO - Optimizer step 34: log_σ²=0.000430, weight=0.999570 2025-07-16 08:33:22,100 - INFO - log_σ² gradient: -0.559855 2025-07-16 08:33:22,179 - INFO - Optimizer step 35: log_σ²=0.000455, weight=0.999545 2025-07-16 08:33:44,204 - 
INFO - log_σ² gradient: -0.568272 2025-07-16 08:33:44,277 - INFO - Optimizer step 36: log_σ²=0.000481, weight=0.999519 2025-07-16 08:34:06,526 - INFO - log_σ² gradient: -0.561703 2025-07-16 08:34:06,601 - INFO - Optimizer step 37: log_σ²=0.000508, weight=0.999492 2025-07-16 08:34:27,135 - INFO - log_σ² gradient: -0.561730 2025-07-16 08:34:27,211 - INFO - Optimizer step 38: log_σ²=0.000536, weight=0.999465 2025-07-16 08:34:48,810 - INFO - log_σ² gradient: -0.549345 2025-07-16 08:34:48,883 - INFO - Optimizer step 39: log_σ²=0.000564, weight=0.999436 2025-07-16 08:35:11,352 - INFO - log_σ² gradient: -0.557178 2025-07-16 08:35:11,424 - INFO - Optimizer step 40: log_σ²=0.000593, weight=0.999408 2025-07-16 08:35:33,069 - INFO - log_σ² gradient: -0.563359 2025-07-16 08:35:33,142 - INFO - Optimizer step 41: log_σ²=0.000622, weight=0.999378 2025-07-16 08:35:54,165 - INFO - log_σ² gradient: -0.560122 2025-07-16 08:35:54,238 - INFO - Optimizer step 42: log_σ²=0.000653, weight=0.999348 2025-07-16 08:36:14,145 - INFO - log_σ² gradient: -0.558601 2025-07-16 08:36:14,215 - INFO - Optimizer step 43: log_σ²=0.000684, weight=0.999317 2025-07-16 08:36:35,167 - INFO - log_σ² gradient: -0.550537 2025-07-16 08:36:35,240 - INFO - Optimizer step 44: log_σ²=0.000715, weight=0.999285 2025-07-16 08:36:56,928 - INFO - log_σ² gradient: -0.560463 2025-07-16 08:36:57,001 - INFO - Optimizer step 45: log_σ²=0.000748, weight=0.999252 2025-07-16 08:37:18,661 - INFO - log_σ² gradient: -0.555325 2025-07-16 08:37:18,731 - INFO - Optimizer step 46: log_σ²=0.000781, weight=0.999219 2025-07-16 08:37:40,889 - INFO - log_σ² gradient: -0.551064 2025-07-16 08:37:40,960 - INFO - Optimizer step 47: log_σ²=0.000815, weight=0.999185 2025-07-16 08:38:01,660 - INFO - log_σ² gradient: -0.548375 2025-07-16 08:38:01,730 - INFO - Optimizer step 48: log_σ²=0.000850, weight=0.999150 2025-07-16 08:38:13,173 - INFO - log_σ² gradient: -0.249199 2025-07-16 08:38:13,251 - INFO - Optimizer step 49: log_σ²=0.000884, 
weight=0.999117 2025-07-16 08:38:13,457 - INFO - Epoch 1: Total optimizer steps: 49 2025-07-16 08:41:10,459 - INFO - Validation metrics: 2025-07-16 08:41:10,459 - INFO - Loss: 0.9396 2025-07-16 08:41:10,459 - INFO - BCE Loss: 0.5631 2025-07-16 08:41:10,460 - INFO - Weighted BCE Loss: 0.5626 2025-07-16 08:41:10,460 - INFO - Average similarity: 0.4324 2025-07-16 08:41:10,460 - INFO - Median similarity: 0.4372 2025-07-16 08:41:10,460 - INFO - Clean sample similarity: 0.4324 2025-07-16 08:41:10,460 - INFO - Corrupted sample similarity: 0.3555 2025-07-16 08:41:10,460 - INFO - Similarity gap (clean - corrupt): 0.0768 2025-07-16 08:41:10,684 - INFO - Epoch 1/30 - Train Loss: 1.1750, Val Loss: 0.9396, Val BCE: 0.5631, Val wBCE: 0.5626, Clean Sim: 0.4324, Corrupt Sim: 0.3555, Gap: 0.0768, Time: 1246.74s 2025-07-16 08:41:10,684 - INFO - New best validation loss: 0.9396 2025-07-16 08:41:13,264 - INFO - New best similarity gap: 0.0768 2025-07-16 08:41:48,449 - INFO - log_σ² gradient: -0.562548 2025-07-16 08:41:48,520 - INFO - Optimizer step 1: log_σ²=0.000919, weight=0.999082 2025-07-16 08:42:10,364 - INFO - log_σ² gradient: -0.560402 2025-07-16 08:42:10,438 - INFO - Optimizer step 2: log_σ²=0.000954, weight=0.999046 2025-07-16 08:42:33,029 - INFO - log_σ² gradient: -0.565008 2025-07-16 08:42:33,104 - INFO - Optimizer step 3: log_σ²=0.000991, weight=0.999010 2025-07-16 08:42:55,732 - INFO - log_σ² gradient: -0.560273 2025-07-16 08:42:55,803 - INFO - Optimizer step 4: log_σ²=0.001028, weight=0.998972 2025-07-16 08:43:18,713 - INFO - log_σ² gradient: -0.562259 2025-07-16 08:43:18,785 - INFO - Optimizer step 5: log_σ²=0.001067, weight=0.998934 2025-07-16 08:43:39,594 - INFO - log_σ² gradient: -0.559529 2025-07-16 08:43:39,665 - INFO - Optimizer step 6: log_σ²=0.001106, weight=0.998895 2025-07-16 08:44:02,024 - INFO - log_σ² gradient: -0.547505 2025-07-16 08:44:02,099 - INFO - Optimizer step 7: log_σ²=0.001146, weight=0.998855 2025-07-16 08:44:23,126 - INFO - log_σ² gradient: 
-0.549267 2025-07-16 08:44:23,207 - INFO - Optimizer step 8: log_σ²=0.001187, weight=0.998814 2025-07-16 08:44:45,583 - INFO - log_σ² gradient: -0.552445 2025-07-16 08:44:45,655 - INFO - Optimizer step 9: log_σ²=0.001229, weight=0.998772 2025-07-16 08:45:08,444 - INFO - log_σ² gradient: -0.552005 2025-07-16 08:45:08,519 - INFO - Optimizer step 10: log_σ²=0.001271, weight=0.998729 2025-07-16 08:45:30,160 - INFO - log_σ² gradient: -0.554567 2025-07-16 08:45:30,226 - INFO - Optimizer step 11: log_σ²=0.001315, weight=0.998686 2025-07-16 08:45:51,893 - INFO - log_σ² gradient: -0.557522 2025-07-16 08:45:51,971 - INFO - Optimizer step 12: log_σ²=0.001359, weight=0.998642 2025-07-16 08:46:14,034 - INFO - log_σ² gradient: -0.555053 2025-07-16 08:46:14,105 - INFO - Optimizer step 13: log_σ²=0.001404, weight=0.998597 2025-07-16 08:46:35,915 - INFO - log_σ² gradient: -0.551558 2025-07-16 08:46:35,991 - INFO - Optimizer step 14: log_σ²=0.001450, weight=0.998551 2025-07-16 08:46:57,325 - INFO - log_σ² gradient: -0.563379 2025-07-16 08:46:57,396 - INFO - Optimizer step 15: log_σ²=0.001497, weight=0.998504 2025-07-16 08:47:17,460 - INFO - log_σ² gradient: -0.555350 2025-07-16 08:47:17,533 - INFO - Optimizer step 16: log_σ²=0.001545, weight=0.998457 2025-07-16 08:47:38,387 - INFO - log_σ² gradient: -0.553956 2025-07-16 08:47:38,460 - INFO - Optimizer step 17: log_σ²=0.001593, weight=0.998408 2025-07-16 08:48:00,532 - INFO - log_σ² gradient: -0.555845 2025-07-16 08:48:00,602 - INFO - Optimizer step 18: log_σ²=0.001642, weight=0.998359 2025-07-16 08:48:22,538 - INFO - log_σ² gradient: -0.556693 2025-07-16 08:48:22,616 - INFO - Optimizer step 19: log_σ²=0.001692, weight=0.998309 2025-07-16 08:48:43,168 - INFO - log_σ² gradient: -0.559947 2025-07-16 08:48:43,246 - INFO - Optimizer step 20: log_σ²=0.001743, weight=0.998258 2025-07-16 08:49:06,364 - INFO - log_σ² gradient: -0.560336 2025-07-16 08:49:06,442 - INFO - Optimizer step 21: log_σ²=0.001795, weight=0.998206 2025-07-16 
08:49:28,446 - INFO - log_σ² gradient: -0.548883 2025-07-16 08:49:28,519 - INFO - Optimizer step 22: log_σ²=0.001848, weight=0.998154 2025-07-16 08:49:51,815 - INFO - log_σ² gradient: -0.541997 2025-07-16 08:49:51,886 - INFO - Optimizer step 23: log_σ²=0.001901, weight=0.998101 2025-07-16 08:50:12,700 - INFO - log_σ² gradient: -0.560939 2025-07-16 08:50:12,773 - INFO - Optimizer step 24: log_σ²=0.001955, weight=0.998047 2025-07-16 08:50:34,880 - INFO - log_σ² gradient: -0.554256 2025-07-16 08:50:34,949 - INFO - Optimizer step 25: log_σ²=0.002010, weight=0.997992 2025-07-16 08:50:57,201 - INFO - log_σ² gradient: -0.550744 2025-07-16 08:50:57,275 - INFO - Optimizer step 26: log_σ²=0.002066, weight=0.997936 2025-07-16 08:51:19,575 - INFO - log_σ² gradient: -0.552476 2025-07-16 08:51:19,642 - INFO - Optimizer step 27: log_σ²=0.002122, weight=0.997880 2025-07-16 08:51:40,096 - INFO - log_σ² gradient: -0.550468 2025-07-16 08:51:40,168 - INFO - Optimizer step 28: log_σ²=0.002180, weight=0.997823 2025-07-16 08:52:01,070 - INFO - log_σ² gradient: -0.551074 2025-07-16 08:52:01,142 - INFO - Optimizer step 29: log_σ²=0.002238, weight=0.997765 2025-07-16 08:52:22,509 - INFO - log_σ² gradient: -0.546845 2025-07-16 08:52:22,580 - INFO - Optimizer step 30: log_σ²=0.002297, weight=0.997706 2025-07-16 08:52:44,653 - INFO - log_σ² gradient: -0.544335 2025-07-16 08:52:44,724 - INFO - Optimizer step 31: log_σ²=0.002356, weight=0.997647 2025-07-16 08:53:06,674 - INFO - log_σ² gradient: -0.544547 2025-07-16 08:53:06,748 - INFO - Optimizer step 32: log_σ²=0.002416, weight=0.997586 2025-07-16 08:53:29,518 - INFO - log_σ² gradient: -0.547305 2025-07-16 08:53:29,593 - INFO - Optimizer step 33: log_σ²=0.002477, weight=0.997526 2025-07-16 08:53:49,884 - INFO - log_σ² gradient: -0.546751 2025-07-16 08:53:49,955 - INFO - Optimizer step 34: log_σ²=0.002539, weight=0.997464 2025-07-16 08:54:10,197 - INFO - log_σ² gradient: -0.552852 2025-07-16 08:54:10,269 - INFO - Optimizer step 35: 
log_σ²=0.002602, weight=0.997401 2025-07-16 08:54:32,077 - INFO - log_σ² gradient: -0.559388 2025-07-16 08:54:32,150 - INFO - Optimizer step 36: log_σ²=0.002665, weight=0.997338 2025-07-16 08:54:53,281 - INFO - log_σ² gradient: -0.543661 2025-07-16 08:54:53,353 - INFO - Optimizer step 37: log_σ²=0.002730, weight=0.997274 2025-07-16 08:55:15,466 - INFO - log_σ² gradient: -0.550387 2025-07-16 08:55:15,544 - INFO - Optimizer step 38: log_σ²=0.002795, weight=0.997209 2025-07-16 08:55:38,557 - INFO - log_σ² gradient: -0.544392 2025-07-16 08:55:38,630 - INFO - Optimizer step 39: log_σ²=0.002860, weight=0.997144 2025-07-16 08:56:00,832 - INFO - log_σ² gradient: -0.559151 2025-07-16 08:56:00,903 - INFO - Optimizer step 40: log_σ²=0.002927, weight=0.997077 2025-07-16 08:56:22,543 - INFO - log_σ² gradient: -0.542131 2025-07-16 08:56:22,614 - INFO - Optimizer step 41: log_σ²=0.002994, weight=0.997010 2025-07-16 08:56:44,262 - INFO - log_σ² gradient: -0.544075 2025-07-16 08:56:44,333 - INFO - Optimizer step 42: log_σ²=0.003063, weight=0.996942 2025-07-16 08:57:06,702 - INFO - log_σ² gradient: -0.554919 2025-07-16 08:57:06,780 - INFO - Optimizer step 43: log_σ²=0.003132, weight=0.996873 2025-07-16 08:57:28,967 - INFO - log_σ² gradient: -0.548503 2025-07-16 08:57:29,037 - INFO - Optimizer step 44: log_σ²=0.003201, weight=0.996804 2025-07-16 08:57:50,334 - INFO - log_σ² gradient: -0.549026 2025-07-16 08:57:50,402 - INFO - Optimizer step 45: log_σ²=0.003272, weight=0.996733 2025-07-16 08:58:11,838 - INFO - log_σ² gradient: -0.547647 2025-07-16 08:58:11,910 - INFO - Optimizer step 46: log_σ²=0.003343, weight=0.996662 2025-07-16 08:58:33,581 - INFO - log_σ² gradient: -0.554392 2025-07-16 08:58:33,649 - INFO - Optimizer step 47: log_σ²=0.003415, weight=0.996590 2025-07-16 08:58:55,433 - INFO - log_σ² gradient: -0.546648 2025-07-16 08:58:55,506 - INFO - Optimizer step 48: log_σ²=0.003488, weight=0.996518 2025-07-16 08:59:04,386 - INFO - log_σ² gradient: -0.252545 2025-07-16 
08:59:04,459 - INFO - Optimizer step 49: log_σ²=0.003558, weight=0.996448 2025-07-16 08:59:04,671 - INFO - Epoch 2: Total optimizer steps: 49 2025-07-16 09:02:01,410 - INFO - Validation metrics: 2025-07-16 09:02:01,410 - INFO - Loss: 0.8383 2025-07-16 09:02:01,410 - INFO - BCE Loss: 0.5520 2025-07-16 09:02:01,410 - INFO - Weighted BCE Loss: 0.5500 2025-07-16 09:02:01,410 - INFO - Average similarity: 0.5789 2025-07-16 09:02:01,410 - INFO - Median similarity: 0.6112 2025-07-16 09:02:01,410 - INFO - Clean sample similarity: 0.5789 2025-07-16 09:02:01,410 - INFO - Corrupted sample similarity: 0.4373 2025-07-16 09:02:01,410 - INFO - Similarity gap (clean - corrupt): 0.1416 2025-07-16 09:02:01,544 - INFO - Epoch 2/30 - Train Loss: 0.9187, Val Loss: 0.8383, Val BCE: 0.5520, Val wBCE: 0.5500, Clean Sim: 0.5789, Corrupt Sim: 0.4373, Gap: 0.1416, Time: 1244.98s 2025-07-16 09:02:01,545 - INFO - New best validation loss: 0.8383 2025-07-16 09:02:04,235 - INFO - New best similarity gap: 0.1416 2025-07-16 09:04:51,950 - INFO - Epoch 2 Validation Alignment: Pos=0.143, Neg=0.120, Gap=0.023 2025-07-16 09:05:24,788 - INFO - log_σ² gradient: -0.551619 2025-07-16 09:05:24,858 - INFO - Optimizer step 1: log_σ²=0.003629, weight=0.996377 2025-07-16 09:05:46,090 - INFO - log_σ² gradient: -0.541849 2025-07-16 09:05:46,162 - INFO - Optimizer step 2: log_σ²=0.003702, weight=0.996305 2025-07-16 09:06:08,277 - INFO - log_σ² gradient: -0.552321 2025-07-16 09:06:08,347 - INFO - Optimizer step 3: log_σ²=0.003775, weight=0.996232 2025-07-16 09:06:30,984 - INFO - log_σ² gradient: -0.541674 2025-07-16 09:06:31,053 - INFO - Optimizer step 4: log_σ²=0.003849, weight=0.996158 2025-07-16 09:06:52,054 - INFO - log_σ² gradient: -0.548615 2025-07-16 09:06:52,128 - INFO - Optimizer step 5: log_σ²=0.003925, weight=0.996083 2025-07-16 09:07:13,675 - INFO - log_σ² gradient: -0.545819 2025-07-16 09:07:13,746 - INFO - Optimizer step 6: log_σ²=0.004001, weight=0.996007 2025-07-16 09:07:35,967 - INFO - log_σ² 
gradient: -0.538285 2025-07-16 09:07:36,042 - INFO - Optimizer step 7: log_σ²=0.004078, weight=0.995930 2025-07-16 09:07:59,392 - INFO - log_σ² gradient: -0.554427 2025-07-16 09:07:59,475 - INFO - Optimizer step 8: log_σ²=0.004157, weight=0.995852 2025-07-16 09:08:21,320 - INFO - log_σ² gradient: -0.552556 2025-07-16 09:08:21,397 - INFO - Optimizer step 9: log_σ²=0.004236, weight=0.995773 2025-07-16 09:08:43,207 - INFO - log_σ² gradient: -0.560449 2025-07-16 09:08:43,282 - INFO - Optimizer step 10: log_σ²=0.004317, weight=0.995693 2025-07-16 09:09:04,761 - INFO - log_σ² gradient: -0.551200 2025-07-16 09:09:04,835 - INFO - Optimizer step 11: log_σ²=0.004398, weight=0.995612 2025-07-16 09:09:26,549 - INFO - log_σ² gradient: -0.536372 2025-07-16 09:09:26,627 - INFO - Optimizer step 12: log_σ²=0.004480, weight=0.995530 2025-07-16 09:09:50,208 - INFO - log_σ² gradient: -0.546741 2025-07-16 09:09:50,280 - INFO - Optimizer step 13: log_σ²=0.004563, weight=0.995447 2025-07-16 09:10:12,722 - INFO - log_σ² gradient: -0.552697 2025-07-16 09:10:12,796 - INFO - Optimizer step 14: log_σ²=0.004647, weight=0.995363 2025-07-16 09:10:35,912 - INFO - log_σ² gradient: -0.542606 2025-07-16 09:10:35,983 - INFO - Optimizer step 15: log_σ²=0.004732, weight=0.995279 2025-07-16 09:10:58,937 - INFO - log_σ² gradient: -0.548309 2025-07-16 09:10:59,015 - INFO - Optimizer step 16: log_σ²=0.004818, weight=0.995194 2025-07-16 09:11:21,290 - INFO - log_σ² gradient: -0.550456 2025-07-16 09:11:21,359 - INFO - Optimizer step 17: log_σ²=0.004905, weight=0.995107 2025-07-16 09:11:44,051 - INFO - log_σ² gradient: -0.546478 2025-07-16 09:11:44,123 - INFO - Optimizer step 18: log_σ²=0.004992, weight=0.995020 2025-07-16 09:12:06,309 - INFO - log_σ² gradient: -0.545807 2025-07-16 09:12:06,386 - INFO - Optimizer step 19: log_σ²=0.005080, weight=0.994933 2025-07-16 09:12:29,351 - INFO - log_σ² gradient: -0.542592 2025-07-16 09:12:29,425 - INFO - Optimizer step 20: log_σ²=0.005169, weight=0.994844 2025-07-16 
09:12:50,620 - INFO - log_σ² gradient: -0.552501 2025-07-16 09:12:50,690 - INFO - Optimizer step 21: log_σ²=0.005259, weight=0.994754 2025-07-16 09:13:13,540 - INFO - log_σ² gradient: -0.541042 2025-07-16 09:13:13,611 - INFO - Optimizer step 22: log_σ²=0.005350, weight=0.994664 2025-07-16 09:13:34,848 - INFO - log_σ² gradient: -0.539292 2025-07-16 09:13:34,921 - INFO - Optimizer step 23: log_σ²=0.005442, weight=0.994573 2025-07-16 09:13:56,298 - INFO - log_σ² gradient: -0.549630 2025-07-16 09:13:56,372 - INFO - Optimizer step 24: log_σ²=0.005534, weight=0.994481 2025-07-16 09:14:18,305 - INFO - log_σ² gradient: -0.536380 2025-07-16 09:14:18,375 - INFO - Optimizer step 25: log_σ²=0.005627, weight=0.994389 2025-07-16 09:14:40,580 - INFO - log_σ² gradient: -0.539152 2025-07-16 09:14:40,657 - INFO - Optimizer step 26: log_σ²=0.005721, weight=0.994295 2025-07-16 09:15:03,029 - INFO - log_σ² gradient: -0.538578 2025-07-16 09:15:03,099 - INFO - Optimizer step 27: log_σ²=0.005815, weight=0.994201 2025-07-16 09:15:24,452 - INFO - log_σ² gradient: -0.546386 2025-07-16 09:15:24,525 - INFO - Optimizer step 28: log_σ²=0.005911, weight=0.994107 2025-07-16 09:15:45,931 - INFO - log_σ² gradient: -0.542822 2025-07-16 09:15:46,001 - INFO - Optimizer step 29: log_σ²=0.006007, weight=0.994011 2025-07-16 09:16:07,533 - INFO - log_σ² gradient: -0.548613 2025-07-16 09:16:07,608 - INFO - Optimizer step 30: log_σ²=0.006104, weight=0.993915 2025-07-16 09:16:29,801 - INFO - log_σ² gradient: -0.544888 2025-07-16 09:16:29,875 - INFO - Optimizer step 31: log_σ²=0.006202, weight=0.993817 2025-07-16 09:16:52,085 - INFO - log_σ² gradient: -0.531395 2025-07-16 09:16:52,166 - INFO - Optimizer step 32: log_σ²=0.006300, weight=0.993719 2025-07-16 09:17:13,460 - INFO - log_σ² gradient: -0.541929 2025-07-16 09:17:13,534 - INFO - Optimizer step 33: log_σ²=0.006400, weight=0.993621 2025-07-16 09:17:35,966 - INFO - log_σ² gradient: -0.551742 2025-07-16 09:17:36,036 - INFO - Optimizer step 34: 
log_σ²=0.006500, weight=0.993521 2025-07-16 09:17:58,109 - INFO - log_σ² gradient: -0.546808 2025-07-16 09:17:58,175 - INFO - Optimizer step 35: log_σ²=0.006601, weight=0.993421 2025-07-16 09:18:19,945 - INFO - log_σ² gradient: -0.555860 2025-07-16 09:18:20,019 - INFO - Optimizer step 36: log_σ²=0.006703, weight=0.993319 2025-07-16 09:18:42,160 - INFO - log_σ² gradient: -0.548109 2025-07-16 09:18:42,230 - INFO - Optimizer step 37: log_σ²=0.006806, weight=0.993217 2025-07-16 09:19:03,344 - INFO - log_σ² gradient: -0.533132 2025-07-16 09:19:03,417 - INFO - Optimizer step 38: log_σ²=0.006910, weight=0.993114 2025-07-16 09:19:25,944 - INFO - log_σ² gradient: -0.545496 2025-07-16 09:19:26,017 - INFO - Optimizer step 39: log_σ²=0.007014, weight=0.993011 2025-07-16 09:19:48,753 - INFO - log_σ² gradient: -0.553328 2025-07-16 09:19:48,841 - INFO - Optimizer step 40: log_σ²=0.007119, weight=0.992906 2025-07-16 09:20:09,800 - INFO - log_σ² gradient: -0.538834 2025-07-16 09:20:09,871 - INFO - Optimizer step 41: log_σ²=0.007225, weight=0.992801 2025-07-16 09:20:31,089 - INFO - log_σ² gradient: -0.547153 2025-07-16 09:20:31,168 - INFO - Optimizer step 42: log_σ²=0.007332, weight=0.992695 2025-07-16 09:20:53,007 - INFO - log_σ² gradient: -0.550409 2025-07-16 09:20:53,082 - INFO - Optimizer step 43: log_σ²=0.007440, weight=0.992588 2025-07-16 09:21:15,395 - INFO - log_σ² gradient: -0.552833 2025-07-16 09:21:15,467 - INFO - Optimizer step 44: log_σ²=0.007549, weight=0.992480 2025-07-16 09:21:37,553 - INFO - log_σ² gradient: -0.542730 2025-07-16 09:21:37,625 - INFO - Optimizer step 45: log_σ²=0.007658, weight=0.992371 2025-07-16 09:21:59,155 - INFO - log_σ² gradient: -0.540870 2025-07-16 09:21:59,226 - INFO - Optimizer step 46: log_σ²=0.007768, weight=0.992262 2025-07-16 09:22:21,256 - INFO - log_σ² gradient: -0.546690 2025-07-16 09:22:21,328 - INFO - Optimizer step 47: log_σ²=0.007879, weight=0.992152 2025-07-16 09:22:44,026 - INFO - log_σ² gradient: -0.546634 2025-07-16 
09:22:44,098 - INFO - Optimizer step 48: log_σ²=0.007991, weight=0.992041 2025-07-16 09:22:54,644 - INFO - log_σ² gradient: -0.250474 2025-07-16 09:22:54,718 - INFO - Optimizer step 49: log_σ²=0.008098, weight=0.991935 2025-07-16 09:22:54,945 - INFO - Epoch 3: Total optimizer steps: 49 2025-07-16 09:25:51,965 - INFO - Validation metrics: 2025-07-16 09:25:51,966 - INFO - Loss: 0.7903 2025-07-16 09:25:51,966 - INFO - BCE Loss: 0.5442 2025-07-16 09:25:51,966 - INFO - Weighted BCE Loss: 0.5398 2025-07-16 09:25:51,966 - INFO - Average similarity: 0.6428 2025-07-16 09:25:51,966 - INFO - Median similarity: 0.6840 2025-07-16 09:25:51,966 - INFO - Clean sample similarity: 0.6428 2025-07-16 09:25:51,966 - INFO - Corrupted sample similarity: 0.4705 2025-07-16 09:25:51,966 - INFO - Similarity gap (clean - corrupt): 0.1723 2025-07-16 09:25:52,160 - INFO - Epoch 3/30 - Train Loss: 0.8530, Val Loss: 0.7903, Val BCE: 0.5442, Val wBCE: 0.5398, Clean Sim: 0.6428, Corrupt Sim: 0.4705, Gap: 0.1723, Time: 1260.21s 2025-07-16 09:25:52,160 - INFO - New best validation loss: 0.7903 2025-07-16 09:25:54,849 - INFO - New best similarity gap: 0.1723 2025-07-16 09:26:31,233 - INFO - log_σ² gradient: -0.537983 2025-07-16 09:26:31,301 - INFO - Optimizer step 1: log_σ²=0.008206, weight=0.991828 2025-07-16 09:26:51,802 - INFO - log_σ² gradient: -0.551821 2025-07-16 09:26:51,869 - INFO - Optimizer step 2: log_σ²=0.008315, weight=0.991719 2025-07-16 09:27:13,230 - INFO - log_σ² gradient: -0.540283 2025-07-16 09:27:13,305 - INFO - Optimizer step 3: log_σ²=0.008426, weight=0.991610 2025-07-16 09:27:36,192 - INFO - log_σ² gradient: -0.531610 2025-07-16 09:27:36,270 - INFO - Optimizer step 4: log_σ²=0.008537, weight=0.991499 2025-07-16 09:27:58,676 - INFO - log_σ² gradient: -0.536713 2025-07-16 09:27:58,749 - INFO - Optimizer step 5: log_σ²=0.008650, weight=0.991387 2025-07-16 09:28:21,229 - INFO - log_σ² gradient: -0.525864 2025-07-16 09:28:21,300 - INFO - Optimizer step 6: log_σ²=0.008764, 
weight=0.991275 2025-07-16 09:28:42,280 - INFO - log_σ² gradient: -0.537721 2025-07-16 09:28:42,358 - INFO - Optimizer step 7: log_σ²=0.008878, weight=0.991161 2025-07-16 09:29:03,980 - INFO - log_σ² gradient: -0.530747 2025-07-16 09:29:04,054 - INFO - Optimizer step 8: log_σ²=0.008993, weight=0.991047 2025-07-16 09:29:25,455 - INFO - log_σ² gradient: -0.539865 2025-07-16 09:29:25,527 - INFO - Optimizer step 9: log_σ²=0.009110, weight=0.990932 2025-07-16 09:29:46,825 - INFO - log_σ² gradient: -0.537389 2025-07-16 09:29:46,903 - INFO - Optimizer step 10: log_σ²=0.009227, weight=0.990815 2025-07-16 09:30:08,995 - INFO - log_σ² gradient: -0.549306 2025-07-16 09:30:09,074 - INFO - Optimizer step 11: log_σ²=0.009346, weight=0.990698 2025-07-16 09:30:31,863 - INFO - log_σ² gradient: -0.525880 2025-07-16 09:30:31,934 - INFO - Optimizer step 12: log_σ²=0.009465, weight=0.990579 2025-07-16 09:30:53,465 - INFO - log_σ² gradient: -0.537507 2025-07-16 09:30:53,538 - INFO - Optimizer step 13: log_σ²=0.009586, weight=0.990460 2025-07-16 09:31:16,111 - INFO - log_σ² gradient: -0.545154 2025-07-16 09:31:16,192 - INFO - Optimizer step 14: log_σ²=0.009707, weight=0.990340 2025-07-16 09:31:37,831 - INFO - log_σ² gradient: -0.541453 2025-07-16 09:31:37,904 - INFO - Optimizer step 15: log_σ²=0.009829, weight=0.990219 2025-07-16 09:31:58,931 - INFO - log_σ² gradient: -0.537137 2025-07-16 09:31:59,006 - INFO - Optimizer step 16: log_σ²=0.009953, weight=0.990097 2025-07-16 09:32:22,421 - INFO - log_σ² gradient: -0.537704 2025-07-16 09:32:22,507 - INFO - Optimizer step 17: log_σ²=0.010077, weight=0.989974 2025-07-16 09:32:45,042 - INFO - log_σ² gradient: -0.544489 2025-07-16 09:32:45,115 - INFO - Optimizer step 18: log_σ²=0.010202, weight=0.989850 2025-07-16 09:33:07,217 - INFO - log_σ² gradient: -0.544150 2025-07-16 09:33:07,289 - INFO - Optimizer step 19: log_σ²=0.010328, weight=0.989725 2025-07-16 09:33:29,430 - INFO - log_σ² gradient: -0.532263 2025-07-16 09:33:29,502 - INFO - 
Optimizer step 20: log_σ²=0.010455, weight=0.989600 2025-07-16 09:33:53,641 - INFO - log_σ² gradient: -0.543756 2025-07-16 09:33:53,711 - INFO - Optimizer step 21: log_σ²=0.010583, weight=0.989473 2025-07-16 09:34:15,916 - INFO - log_σ² gradient: -0.541414 2025-07-16 09:34:15,990 - INFO - Optimizer step 22: log_σ²=0.010711, weight=0.989346 2025-07-16 09:34:39,076 - INFO - log_σ² gradient: -0.537287 2025-07-16 09:34:39,146 - INFO - Optimizer step 23: log_σ²=0.010841, weight=0.989218 2025-07-16 09:35:01,296 - INFO - log_σ² gradient: -0.532516 2025-07-16 09:35:01,374 - INFO - Optimizer step 24: log_σ²=0.010971, weight=0.989089 2025-07-16 09:35:23,020 - INFO - log_σ² gradient: -0.541242 2025-07-16 09:35:23,094 - INFO - Optimizer step 25: log_σ²=0.011102, weight=0.988959 2025-07-16 09:35:44,495 - INFO - log_σ² gradient: -0.545674 2025-07-16 09:35:44,565 - INFO - Optimizer step 26: log_σ²=0.011234, weight=0.988829 2025-07-16 09:36:06,210 - INFO - log_σ² gradient: -0.538228 2025-07-16 09:36:06,280 - INFO - Optimizer step 27: log_σ²=0.011367, weight=0.988697 2025-07-16 09:36:27,023 - INFO - log_σ² gradient: -0.531971 2025-07-16 09:36:27,094 - INFO - Optimizer step 28: log_σ²=0.011501, weight=0.988565 2025-07-16 09:36:50,427 - INFO - log_σ² gradient: -0.540218 2025-07-16 09:36:50,506 - INFO - Optimizer step 29: log_σ²=0.011635, weight=0.988432 2025-07-16 09:37:12,839 - INFO - log_σ² gradient: -0.546098 2025-07-16 09:37:12,912 - INFO - Optimizer step 30: log_σ²=0.011771, weight=0.988298 2025-07-16 09:37:34,045 - INFO - log_σ² gradient: -0.542720 2025-07-16 09:37:34,120 - INFO - Optimizer step 31: log_σ²=0.011907, weight=0.988164 2025-07-16 09:37:56,233 - INFO - log_σ² gradient: -0.520801 2025-07-16 09:37:56,306 - INFO - Optimizer step 32: log_σ²=0.012044, weight=0.988029 2025-07-16 09:38:18,092 - INFO - log_σ² gradient: -0.533730 2025-07-16 09:38:18,164 - INFO - Optimizer step 33: log_σ²=0.012181, weight=0.987893 2025-07-16 09:38:40,151 - INFO - log_σ² gradient: -0.538865 
2025-07-16 09:38:40,224 - INFO - Optimizer step 34: log_σ²=0.012319, weight=0.987756 2025-07-16 09:39:02,282 - INFO - log_σ² gradient: -0.528489 2025-07-16 09:39:02,354 - INFO - Optimizer step 35: log_σ²=0.012458, weight=0.987619 2025-07-16 09:39:24,585 - INFO - log_σ² gradient: -0.532723 2025-07-16 09:39:24,658 - INFO - Optimizer step 36: log_σ²=0.012598, weight=0.987481 2025-07-16 09:39:46,017 - INFO - log_σ² gradient: -0.539173 2025-07-16 09:39:46,086 - INFO - Optimizer step 37: log_σ²=0.012738, weight=0.987342 2025-07-16 09:40:08,492 - INFO - log_σ² gradient: -0.542606 2025-07-16 09:40:08,564 - INFO - Optimizer step 38: log_σ²=0.012880, weight=0.987203 2025-07-16 09:40:29,928 - INFO - log_σ² gradient: -0.529490 2025-07-16 09:40:29,998 - INFO - Optimizer step 39: log_σ²=0.013022, weight=0.987063 2025-07-16 09:40:52,119 - INFO - log_σ² gradient: -0.530504 2025-07-16 09:40:52,192 - INFO - Optimizer step 40: log_σ²=0.013165, weight=0.986922 2025-07-16 09:41:15,006 - INFO - log_σ² gradient: -0.537062 2025-07-16 09:41:15,079 - INFO - Optimizer step 41: log_σ²=0.013308, weight=0.986780 2025-07-16 09:41:37,643 - INFO - log_σ² gradient: -0.528151 2025-07-16 09:41:37,715 - INFO - Optimizer step 42: log_σ²=0.013452, weight=0.986638 2025-07-16 09:42:01,185 - INFO - log_σ² gradient: -0.539426 2025-07-16 09:42:01,258 - INFO - Optimizer step 43: log_σ²=0.013597, weight=0.986495 2025-07-16 09:42:23,160 - INFO - log_σ² gradient: -0.539894 2025-07-16 09:42:23,235 - INFO - Optimizer step 44: log_σ²=0.013743, weight=0.986351 2025-07-16 09:42:46,302 - INFO - log_σ² gradient: -0.553312 2025-07-16 09:42:46,374 - INFO - Optimizer step 45: log_σ²=0.013891, weight=0.986205 2025-07-16 09:43:08,007 - INFO - log_σ² gradient: -0.552761 2025-07-16 09:43:08,086 - INFO - Optimizer step 46: log_σ²=0.014039, weight=0.986059 2025-07-16 09:43:31,068 - INFO - log_σ² gradient: -0.544273 2025-07-16 09:43:31,138 - INFO - Optimizer step 47: log_σ²=0.014188, weight=0.985912 2025-07-16 09:43:52,465 - 
INFO - log_σ² gradient: -0.546915 2025-07-16 09:43:52,537 - INFO - Optimizer step 48: log_σ²=0.014339, weight=0.985764 2025-07-16 09:44:02,718 - INFO - log_σ² gradient: -0.259212 2025-07-16 09:44:02,793 - INFO - Optimizer step 49: log_σ²=0.014482, weight=0.985622 2025-07-16 09:44:02,966 - INFO - Epoch 4: Total optimizer steps: 49 2025-07-16 09:47:00,793 - INFO - Validation metrics: 2025-07-16 09:47:00,794 - INFO - Loss: 0.7693 2025-07-16 09:47:00,794 - INFO - BCE Loss: 0.5501 2025-07-16 09:47:00,794 - INFO - Weighted BCE Loss: 0.5421 2025-07-16 09:47:00,794 - INFO - Average similarity: 0.6012 2025-07-16 09:47:00,794 - INFO - Median similarity: 0.6346 2025-07-16 09:47:00,794 - INFO - Clean sample similarity: 0.6012 2025-07-16 09:47:00,794 - INFO - Corrupted sample similarity: 0.4169 2025-07-16 09:47:00,794 - INFO - Similarity gap (clean - corrupt): 0.1843 2025-07-16 09:47:00,938 - INFO - Epoch 4/30 - Train Loss: 0.8162, Val Loss: 0.7693, Val BCE: 0.5501, Val wBCE: 0.5421, Clean Sim: 0.6012, Corrupt Sim: 0.4169, Gap: 0.1843, Time: 1262.67s 2025-07-16 09:47:00,938 - INFO - New best validation loss: 0.7693 2025-07-16 09:47:04,299 - INFO - New best similarity gap: 0.1843 2025-07-16 09:49:52,425 - INFO - Epoch 4 Validation Alignment: Pos=0.110, Neg=0.086, Gap=0.024 2025-07-16 09:50:25,215 - INFO - log_σ² gradient: -0.548225 2025-07-16 09:50:25,283 - INFO - Optimizer step 1: log_σ²=0.014628, weight=0.985479 2025-07-16 09:50:47,205 - INFO - log_σ² gradient: -0.545897 2025-07-16 09:50:47,277 - INFO - Optimizer step 2: log_σ²=0.014775, weight=0.985334 2025-07-16 09:51:08,408 - INFO - log_σ² gradient: -0.535924 2025-07-16 09:51:08,490 - INFO - Optimizer step 3: log_σ²=0.014923, weight=0.985188 2025-07-16 09:51:30,318 - INFO - log_σ² gradient: -0.540831 2025-07-16 09:51:30,396 - INFO - Optimizer step 4: log_σ²=0.015073, weight=0.985040 2025-07-16 09:51:50,569 - INFO - log_σ² gradient: -0.541919 2025-07-16 09:51:50,645 - INFO - Optimizer step 5: log_σ²=0.015223, weight=0.984892 
2025-07-16 09:52:12,379 - INFO - log_σ² gradient: -0.545375 2025-07-16 09:52:12,449 - INFO - Optimizer step 6: log_σ²=0.015376, weight=0.984742 2025-07-16 09:52:33,668 - INFO - log_σ² gradient: -0.533545 2025-07-16 09:52:33,746 - INFO - Optimizer step 7: log_σ²=0.015529, weight=0.984591 2025-07-16 09:52:55,355 - INFO - log_σ² gradient: -0.531815 2025-07-16 09:52:55,436 - INFO - Optimizer step 8: log_σ²=0.015683, weight=0.984439 2025-07-16 09:53:16,128 - INFO - log_σ² gradient: -0.541757 2025-07-16 09:53:16,200 - INFO - Optimizer step 9: log_σ²=0.015839, weight=0.984286 2025-07-16 09:53:37,713 - INFO - log_σ² gradient: -0.539865 2025-07-16 09:53:37,785 - INFO - Optimizer step 10: log_σ²=0.015995, weight=0.984132 2025-07-16 09:54:00,122 - INFO - log_σ² gradient: -0.544010 2025-07-16 09:54:00,195 - INFO - Optimizer step 11: log_σ²=0.016153, weight=0.983977 2025-07-16 09:54:22,436 - INFO - log_σ² gradient: -0.530380 2025-07-16 09:54:22,511 - INFO - Optimizer step 12: log_σ²=0.016311, weight=0.983821 2025-07-16 09:54:43,387 - INFO - log_σ² gradient: -0.523923 2025-07-16 09:54:43,458 - INFO - Optimizer step 13: log_σ²=0.016470, weight=0.983665 2025-07-16 09:55:04,882 - INFO - log_σ² gradient: -0.530526 2025-07-16 09:55:04,955 - INFO - Optimizer step 14: log_σ²=0.016630, weight=0.983507 2025-07-16 09:55:28,254 - INFO - log_σ² gradient: -0.536887 2025-07-16 09:55:28,327 - INFO - Optimizer step 15: log_σ²=0.016791, weight=0.983349 2025-07-16 09:55:50,460 - INFO - log_σ² gradient: -0.532294 2025-07-16 09:55:50,533 - INFO - Optimizer step 16: log_σ²=0.016953, weight=0.983190 2025-07-16 09:56:12,432 - INFO - log_σ² gradient: -0.524873 2025-07-16 09:56:12,505 - INFO - Optimizer step 17: log_σ²=0.017115, weight=0.983031 2025-07-16 09:56:33,417 - INFO - log_σ² gradient: -0.522075 2025-07-16 09:56:33,488 - INFO - Optimizer step 18: log_σ²=0.017278, weight=0.982871 2025-07-16 09:56:55,028 - INFO - log_σ² gradient: -0.535281 2025-07-16 09:56:55,098 - INFO - Optimizer step 19: 
log_σ²=0.017441, weight=0.982710 2025-07-16 09:57:16,348 - INFO - log_σ² gradient: -0.538757 2025-07-16 09:57:16,415 - INFO - Optimizer step 20: log_σ²=0.017606, weight=0.982548 2025-07-16 09:57:38,897 - INFO - log_σ² gradient: -0.530159 2025-07-16 09:57:38,977 - INFO - Optimizer step 21: log_σ²=0.017772, weight=0.982385 2025-07-16 09:58:02,234 - INFO - log_σ² gradient: -0.532125 2025-07-16 09:58:02,320 - INFO - Optimizer step 22: log_σ²=0.017938, weight=0.982222 2025-07-16 09:58:24,365 - INFO - log_σ² gradient: -0.539036 2025-07-16 09:58:24,436 - INFO - Optimizer step 23: log_σ²=0.018106, weight=0.982057 2025-07-16 09:58:46,349 - INFO - log_σ² gradient: -0.530246 2025-07-16 09:58:46,421 - INFO - Optimizer step 24: log_σ²=0.018274, weight=0.981892 2025-07-16 09:59:07,223 - INFO - log_σ² gradient: -0.537864 2025-07-16 09:59:07,297 - INFO - Optimizer step 25: log_σ²=0.018443, weight=0.981726 2025-07-16 09:59:30,773 - INFO - log_σ² gradient: -0.531215 2025-07-16 09:59:30,845 - INFO - Optimizer step 26: log_σ²=0.018613, weight=0.981559 2025-07-16 09:59:52,015 - INFO - log_σ² gradient: -0.536691 2025-07-16 09:59:52,097 - INFO - Optimizer step 27: log_σ²=0.018784, weight=0.981391 2025-07-16 10:00:14,645 - INFO - log_σ² gradient: -0.530482 2025-07-16 10:00:14,716 - INFO - Optimizer step 28: log_σ²=0.018955, weight=0.981223 2025-07-16 10:00:37,506 - INFO - log_σ² gradient: -0.527557 2025-07-16 10:00:37,575 - INFO - Optimizer step 29: log_σ²=0.019128, weight=0.981054 2025-07-16 10:00:58,930 - INFO - log_σ² gradient: -0.530403 2025-07-16 10:00:59,008 - INFO - Optimizer step 30: log_σ²=0.019301, weight=0.980884 2025-07-16 10:01:21,340 - INFO - log_σ² gradient: -0.531647 2025-07-16 10:01:21,413 - INFO - Optimizer step 31: log_σ²=0.019475, weight=0.980714 2025-07-16 10:01:40,923 - INFO - log_σ² gradient: -0.524055 2025-07-16 10:01:40,991 - INFO - Optimizer step 32: log_σ²=0.019649, weight=0.980543 2025-07-16 10:02:03,664 - INFO - log_σ² gradient: -0.536347 2025-07-16 
10:02:03,736 - INFO - Optimizer step 33: log_σ²=0.019824, weight=0.980371 2025-07-16 10:02:26,364 - INFO - log_σ² gradient: -0.525249 2025-07-16 10:02:26,437 - INFO - Optimizer step 34: log_σ²=0.020000, weight=0.980199 2025-07-16 10:02:48,517 - INFO - log_σ² gradient: -0.525609 2025-07-16 10:02:48,588 - INFO - Optimizer step 35: log_σ²=0.020177, weight=0.980026 2025-07-16 10:03:10,414 - INFO - log_σ² gradient: -0.544688 2025-07-16 10:03:10,488 - INFO - Optimizer step 36: log_σ²=0.020354, weight=0.979851 2025-07-16 10:03:34,253 - INFO - log_σ² gradient: -0.536496 2025-07-16 10:03:34,327 - INFO - Optimizer step 37: log_σ²=0.020533, weight=0.979676 2025-07-16 10:03:57,049 - INFO - log_σ² gradient: -0.523808 2025-07-16 10:03:57,128 - INFO - Optimizer step 38: log_σ²=0.020712, weight=0.979501 2025-07-16 10:04:18,806 - INFO - log_σ² gradient: -0.536923 2025-07-16 10:04:18,885 - INFO - Optimizer step 39: log_σ²=0.020893, weight=0.979324 2025-07-16 10:04:41,274 - INFO - log_σ² gradient: -0.540741 2025-07-16 10:04:41,345 - INFO - Optimizer step 40: log_σ²=0.021074, weight=0.979146 2025-07-16 10:05:03,460 - INFO - log_σ² gradient: -0.537178 2025-07-16 10:05:03,532 - INFO - Optimizer step 41: log_σ²=0.021256, weight=0.978968 2025-07-16 10:05:24,879 - INFO - log_σ² gradient: -0.538689 2025-07-16 10:05:24,951 - INFO - Optimizer step 42: log_σ²=0.021440, weight=0.978789 2025-07-16 10:05:47,298 - INFO - log_σ² gradient: -0.529676 2025-07-16 10:05:47,366 - INFO - Optimizer step 43: log_σ²=0.021624, weight=0.978609 2025-07-16 10:06:08,082 - INFO - log_σ² gradient: -0.542155 2025-07-16 10:06:08,160 - INFO - Optimizer step 44: log_σ²=0.021809, weight=0.978428 2025-07-16 10:06:30,694 - INFO - log_σ² gradient: -0.542068 2025-07-16 10:06:30,760 - INFO - Optimizer step 45: log_σ²=0.021995, weight=0.978245 2025-07-16 10:06:52,955 - INFO - log_σ² gradient: -0.544074 2025-07-16 10:06:53,030 - INFO - Optimizer step 46: log_σ²=0.022182, weight=0.978062 2025-07-16 10:07:14,756 - INFO - log_σ² 
gradient: -0.529234 2025-07-16 10:07:14,826 - INFO - Optimizer step 47: log_σ²=0.022370, weight=0.977879 2025-07-16 10:07:37,261 - INFO - log_σ² gradient: -0.531332 2025-07-16 10:07:37,339 - INFO - Optimizer step 48: log_σ²=0.022558, weight=0.977695 2025-07-16 10:07:47,663 - INFO - log_σ² gradient: -0.243481 2025-07-16 10:07:47,741 - INFO - Optimizer step 49: log_σ²=0.022737, weight=0.977519 2025-07-16 10:07:47,918 - INFO - Epoch 5: Total optimizer steps: 49 2025-07-16 10:10:45,884 - INFO - Validation metrics: 2025-07-16 10:10:45,884 - INFO - Loss: 0.7356 2025-07-16 10:10:45,884 - INFO - BCE Loss: 0.5313 2025-07-16 10:10:45,884 - INFO - Weighted BCE Loss: 0.5194 2025-07-16 10:10:45,884 - INFO - Average similarity: 0.6801 2025-07-16 10:10:45,884 - INFO - Median similarity: 0.7201 2025-07-16 10:10:45,884 - INFO - Clean sample similarity: 0.6801 2025-07-16 10:10:45,884 - INFO - Corrupted sample similarity: 0.4744 2025-07-16 10:10:45,884 - INFO - Similarity gap (clean - corrupt): 0.2057 2025-07-16 10:10:46,090 - INFO - Epoch 5/30 - Train Loss: 0.7962, Val Loss: 0.7356, Val BCE: 0.5313, Val wBCE: 0.5194, Clean Sim: 0.6801, Corrupt Sim: 0.4744, Gap: 0.2057, Time: 1253.66s 2025-07-16 10:10:46,090 - INFO - New best validation loss: 0.7356 2025-07-16 10:10:48,749 - INFO - New best similarity gap: 0.2057 2025-07-16 10:11:22,840 - INFO - log_σ² gradient: -0.527686 2025-07-16 10:11:22,913 - INFO - Optimizer step 1: log_σ²=0.022918, weight=0.977343 2025-07-16 10:11:45,006 - INFO - log_σ² gradient: -0.525835 2025-07-16 10:11:45,076 - INFO - Optimizer step 2: log_σ²=0.023100, weight=0.977165 2025-07-16 10:12:07,843 - INFO - log_σ² gradient: -0.537649 2025-07-16 10:12:07,916 - INFO - Optimizer step 3: log_σ²=0.023284, weight=0.976985 2025-07-16 10:12:30,059 - INFO - log_σ² gradient: -0.527435 2025-07-16 10:12:30,131 - INFO - Optimizer step 4: log_σ²=0.023469, weight=0.976804 2025-07-16 10:12:50,552 - INFO - log_σ² gradient: -0.527576 2025-07-16 10:12:50,627 - INFO - Optimizer step 
5: log_σ²=0.023656, weight=0.976622 2025-07-16 10:13:10,990 - INFO - log_σ² gradient: -0.520578 2025-07-16 10:13:11,068 - INFO - Optimizer step 6: log_σ²=0.023843, weight=0.976439 2025-07-16 10:13:32,998 - INFO - log_σ² gradient: -0.527133 2025-07-16 10:13:33,073 - INFO - Optimizer step 7: log_σ²=0.024032, weight=0.976254 2025-07-16 10:13:53,986 - INFO - log_σ² gradient: -0.539161 2025-07-16 10:13:54,059 - INFO - Optimizer step 8: log_σ²=0.024222, weight=0.976069 2025-07-16 10:14:15,967 - INFO - log_σ² gradient: -0.524393 2025-07-16 10:14:16,040 - INFO - Optimizer step 9: log_σ²=0.024414, weight=0.975882 2025-07-16 10:14:38,234 - INFO - log_σ² gradient: -0.529987 2025-07-16 10:14:38,306 - INFO - Optimizer step 10: log_σ²=0.024606, weight=0.975694 2025-07-16 10:14:59,845 - INFO - log_σ² gradient: -0.534637 2025-07-16 10:14:59,915 - INFO - Optimizer step 11: log_σ²=0.024800, weight=0.975505 2025-07-16 10:15:22,045 - INFO - log_σ² gradient: -0.532029 2025-07-16 10:15:22,123 - INFO - Optimizer step 12: log_σ²=0.024995, weight=0.975315 2025-07-16 10:15:44,313 - INFO - log_σ² gradient: -0.520245 2025-07-16 10:15:44,387 - INFO - Optimizer step 13: log_σ²=0.025190, weight=0.975124 2025-07-16 10:16:07,475 - INFO - log_σ² gradient: -0.532786 2025-07-16 10:16:07,545 - INFO - Optimizer step 14: log_σ²=0.025387, weight=0.974933 2025-07-16 10:16:29,824 - INFO - log_σ² gradient: -0.531132 2025-07-16 10:16:29,897 - INFO - Optimizer step 15: log_σ²=0.025585, weight=0.974740 2025-07-16 10:16:50,768 - INFO - log_σ² gradient: -0.533583 2025-07-16 10:16:50,839 - INFO - Optimizer step 16: log_σ²=0.025784, weight=0.974546 2025-07-16 10:17:14,069 - INFO - log_σ² gradient: -0.522190 2025-07-16 10:17:14,151 - INFO - Optimizer step 17: log_σ²=0.025983, weight=0.974351 2025-07-16 10:17:35,836 - INFO - log_σ² gradient: -0.528222 2025-07-16 10:17:35,906 - INFO - Optimizer step 18: log_σ²=0.026184, weight=0.974156 2025-07-16 10:17:58,086 - INFO - log_σ² gradient: -0.536138 2025-07-16 
10:17:58,163 - INFO - Optimizer step 19: log_σ²=0.026386, weight=0.973960 2025-07-16 10:18:21,070 - INFO - log_σ² gradient: -0.532396 2025-07-16 10:18:21,143 - INFO - Optimizer step 20: log_σ²=0.026588, weight=0.973762 2025-07-16 10:18:42,474 - INFO - log_σ² gradient: -0.522873 2025-07-16 10:18:42,547 - INFO - Optimizer step 21: log_σ²=0.026792, weight=0.973564 2025-07-16 10:19:03,166 - INFO - log_σ² gradient: -0.539955 2025-07-16 10:19:03,239 - INFO - Optimizer step 22: log_σ²=0.026996, weight=0.973365 2025-07-16 10:19:24,831 - INFO - log_σ² gradient: -0.528781 2025-07-16 10:19:24,901 - INFO - Optimizer step 23: log_σ²=0.027202, weight=0.973165 2025-07-16 10:19:47,259 - INFO - log_σ² gradient: -0.525532 2025-07-16 10:19:47,331 - INFO - Optimizer step 24: log_σ²=0.027408, weight=0.972964 2025-07-16 10:20:09,621 - INFO - log_σ² gradient: -0.518871 2025-07-16 10:20:09,699 - INFO - Optimizer step 25: log_σ²=0.027615, weight=0.972763 2025-07-16 10:20:32,621 - INFO - log_σ² gradient: -0.535395 2025-07-16 10:20:32,699 - INFO - Optimizer step 26: log_σ²=0.027822, weight=0.972561 2025-07-16 10:20:54,374 - INFO - log_σ² gradient: -0.526953 2025-07-16 10:20:54,452 - INFO - Optimizer step 27: log_σ²=0.028031, weight=0.972358 2025-07-16 10:21:15,629 - INFO - log_σ² gradient: -0.524942 2025-07-16 10:21:15,697 - INFO - Optimizer step 28: log_σ²=0.028240, weight=0.972155 2025-07-16 10:21:36,484 - INFO - log_σ² gradient: -0.526149 2025-07-16 10:21:36,561 - INFO - Optimizer step 29: log_σ²=0.028450, weight=0.971951 2025-07-16 10:22:00,257 - INFO - log_σ² gradient: -0.529212 2025-07-16 10:22:00,327 - INFO - Optimizer step 30: log_σ²=0.028661, weight=0.971746 2025-07-16 10:22:22,435 - INFO - log_σ² gradient: -0.515584 2025-07-16 10:22:22,509 - INFO - Optimizer step 31: log_σ²=0.028872, weight=0.971540 2025-07-16 10:22:44,343 - INFO - log_σ² gradient: -0.523682 2025-07-16 10:22:44,413 - INFO - Optimizer step 32: log_σ²=0.029084, weight=0.971335 2025-07-16 10:23:05,069 - INFO - log_σ² 
gradient: -0.518722 2025-07-16 10:23:05,147 - INFO - Optimizer step 33: log_σ²=0.029297, weight=0.971128 2025-07-16 10:23:27,139 - INFO - log_σ² gradient: -0.524254 2025-07-16 10:23:27,211 - INFO - Optimizer step 34: log_σ²=0.029510, weight=0.970921 2025-07-16 10:23:49,296 - INFO - log_σ² gradient: -0.524826 2025-07-16 10:23:49,366 - INFO - Optimizer step 35: log_σ²=0.029724, weight=0.970713 2025-07-16 10:24:11,124 - INFO - log_σ² gradient: -0.521474 2025-07-16 10:24:11,196 - INFO - Optimizer step 36: log_σ²=0.029939, weight=0.970505 2025-07-16 10:24:33,494 - INFO - log_σ² gradient: -0.517238 2025-07-16 10:24:33,564 - INFO - Optimizer step 37: log_σ²=0.030154, weight=0.970296 2025-07-16 10:24:56,681 - INFO - log_σ² gradient: -0.520840 2025-07-16 10:24:56,758 - INFO - Optimizer step 38: log_σ²=0.030370, weight=0.970087 2025-07-16 10:25:19,142 - INFO - log_σ² gradient: -0.508262 2025-07-16 10:25:19,214 - INFO - Optimizer step 39: log_σ²=0.030586, weight=0.969877 2025-07-16 10:25:41,844 - INFO - log_σ² gradient: -0.529377 2025-07-16 10:25:41,923 - INFO - Optimizer step 40: log_σ²=0.030803, weight=0.969667 2025-07-16 10:26:03,386 - INFO - log_σ² gradient: -0.529147 2025-07-16 10:26:03,443 - INFO - Optimizer step 41: log_σ²=0.031021, weight=0.969455 2025-07-16 10:26:23,775 - INFO - log_σ² gradient: -0.529861 2025-07-16 10:26:23,847 - INFO - Optimizer step 42: log_σ²=0.031240, weight=0.969243 2025-07-16 10:26:47,122 - INFO - log_σ² gradient: -0.528226 2025-07-16 10:26:47,195 - INFO - Optimizer step 43: log_σ²=0.031460, weight=0.969029 2025-07-16 10:27:08,657 - INFO - log_σ² gradient: -0.527973 2025-07-16 10:27:08,730 - INFO - Optimizer step 44: log_σ²=0.031681, weight=0.968815 2025-07-16 10:27:29,688 - INFO - log_σ² gradient: -0.528014 2025-07-16 10:27:29,763 - INFO - Optimizer step 45: log_σ²=0.031903, weight=0.968600 2025-07-16 10:27:52,804 - INFO - log_σ² gradient: -0.524062 2025-07-16 10:27:52,876 - INFO - Optimizer step 46: log_σ²=0.032126, weight=0.968384 
2025-07-16 10:28:15,883 - INFO - log_σ² gradient: -0.521974 2025-07-16 10:28:15,954 - INFO - Optimizer step 47: log_σ²=0.032350, weight=0.968168 2025-07-16 10:28:39,659 - INFO - log_σ² gradient: -0.530359 2025-07-16 10:28:39,734 - INFO - Optimizer step 48: log_σ²=0.032574, weight=0.967951 2025-07-16 10:28:49,717 - INFO - log_σ² gradient: -0.254091 2025-07-16 10:28:49,787 - INFO - Optimizer step 49: log_σ²=0.032788, weight=0.967743 2025-07-16 10:28:49,987 - INFO - Epoch 6: Total optimizer steps: 49 2025-07-16 10:31:54,758 - INFO - Validation metrics: 2025-07-16 10:31:54,758 - INFO - Loss: 0.7193 2025-07-16 10:31:54,758 - INFO - BCE Loss: 0.5264 2025-07-16 10:31:54,758 - INFO - Weighted BCE Loss: 0.5094 2025-07-16 10:31:54,758 - INFO - Average similarity: 0.6862 2025-07-16 10:31:54,758 - INFO - Median similarity: 0.7224 2025-07-16 10:31:54,758 - INFO - Clean sample similarity: 0.6862 2025-07-16 10:31:54,758 - INFO - Corrupted sample similarity: 0.4718 2025-07-16 10:31:54,758 - INFO - Similarity gap (clean - corrupt): 0.2145 2025-07-16 10:31:54,890 - INFO - Epoch 6/30 - Train Loss: 0.7666, Val Loss: 0.7193, Val BCE: 0.5264, Val wBCE: 0.5094, Clean Sim: 0.6862, Corrupt Sim: 0.4718, Gap: 0.2145, Time: 1262.78s 2025-07-16 10:31:54,890 - INFO - New best validation loss: 0.7193 2025-07-16 10:31:58,630 - INFO - New best similarity gap: 0.2145 2025-07-16 10:34:51,441 - INFO - Epoch 6 Validation Alignment: Pos=0.175, Neg=0.135, Gap=0.040 2025-07-16 10:35:27,876 - INFO - log_σ² gradient: -0.525192 2025-07-16 10:35:27,947 - INFO - Optimizer step 1: log_σ²=0.033004, weight=0.967535 2025-07-16 10:35:50,479 - INFO - log_σ² gradient: -0.533448 2025-07-16 10:35:50,557 - INFO - Optimizer step 2: log_σ²=0.033222, weight=0.967324 2025-07-16 10:36:13,926 - INFO - log_σ² gradient: -0.520257 2025-07-16 10:36:14,004 - INFO - Optimizer step 3: log_σ²=0.033442, weight=0.967111 2025-07-16 10:36:36,674 - INFO - log_σ² gradient: -0.528589 2025-07-16 10:36:36,746 - INFO - Optimizer step 4: 
log_σ²=0.033663, weight=0.966897 2025-07-16 10:37:00,637 - INFO - log_σ² gradient: -0.528008 2025-07-16 10:37:00,712 - INFO - Optimizer step 5: log_σ²=0.033886, weight=0.966682 2025-07-16 10:37:22,991 - INFO - log_σ² gradient: -0.522759 2025-07-16 10:37:23,065 - INFO - Optimizer step 6: log_σ²=0.034110, weight=0.966465 2025-07-16 10:37:46,060 - INFO - log_σ² gradient: -0.519921 2025-07-16 10:37:46,134 - INFO - Optimizer step 7: log_σ²=0.034336, weight=0.966247 2025-07-16 10:38:08,363 - INFO - log_σ² gradient: -0.532761 2025-07-16 10:38:08,433 - INFO - Optimizer step 8: log_σ²=0.034563, weight=0.966028 2025-07-16 10:38:32,247 - INFO - log_σ² gradient: -0.528744 2025-07-16 10:38:32,314 - INFO - Optimizer step 9: log_σ²=0.034791, weight=0.965807 2025-07-16 10:38:54,238 - INFO - log_σ² gradient: -0.528719 2025-07-16 10:38:54,308 - INFO - Optimizer step 10: log_σ²=0.035021, weight=0.965585 2025-07-16 10:39:17,295 - INFO - log_σ² gradient: -0.529162 2025-07-16 10:39:17,366 - INFO - Optimizer step 11: log_σ²=0.035252, weight=0.965362 2025-07-16 10:39:40,653 - INFO - log_σ² gradient: -0.518953 2025-07-16 10:39:40,726 - INFO - Optimizer step 12: log_σ²=0.035485, weight=0.965138 2025-07-16 10:40:02,405 - INFO - log_σ² gradient: -0.524532 2025-07-16 10:40:02,480 - INFO - Optimizer step 13: log_σ²=0.035718, weight=0.964913 2025-07-16 10:40:25,074 - INFO - log_σ² gradient: -0.527185 2025-07-16 10:40:25,147 - INFO - Optimizer step 14: log_σ²=0.035952, weight=0.964687 2025-07-16 10:40:48,054 - INFO - log_σ² gradient: -0.535090 2025-07-16 10:40:48,122 - INFO - Optimizer step 15: log_σ²=0.036188, weight=0.964459 2025-07-16 10:41:09,323 - INFO - log_σ² gradient: -0.528083 2025-07-16 10:41:09,400 - INFO - Optimizer step 16: log_σ²=0.036425, weight=0.964231 2025-07-16 10:41:32,531 - INFO - log_σ² gradient: -0.510952 2025-07-16 10:41:32,605 - INFO - Optimizer step 17: log_σ²=0.036662, weight=0.964002 2025-07-16 10:41:56,286 - INFO - log_σ² gradient: -0.525253 2025-07-16 10:41:56,361 - 
INFO - Optimizer step 18: log_σ²=0.036900, weight=0.963772 2025-07-16 10:42:17,954 - INFO - log_σ² gradient: -0.524274 2025-07-16 10:42:18,027 - INFO - Optimizer step 19: log_σ²=0.037139, weight=0.963542 2025-07-16 10:42:41,058 - INFO - log_σ² gradient: -0.516717 2025-07-16 10:42:41,130 - INFO - Optimizer step 20: log_σ²=0.037379, weight=0.963311 2025-07-16 10:43:12,732 - INFO - log_σ² gradient: -0.524552 2025-07-16 10:43:12,806 - INFO - Optimizer step 21: log_σ²=0.037620, weight=0.963079 2025-07-16 10:43:37,107 - INFO - log_σ² gradient: -0.509323 2025-07-16 10:43:37,179 - INFO - Optimizer step 22: log_σ²=0.037860, weight=0.962847 2025-07-16 10:44:00,228 - INFO - log_σ² gradient: -0.528261 2025-07-16 10:44:00,302 - INFO - Optimizer step 23: log_σ²=0.038103, weight=0.962614 2025-07-16 10:44:22,843 - INFO - log_σ² gradient: -0.530425 2025-07-16 10:44:22,915 - INFO - Optimizer step 24: log_σ²=0.038346, weight=0.962380 2025-07-16 10:44:45,382 - INFO - log_σ² gradient: -0.518938 2025-07-16 10:44:45,463 - INFO - Optimizer step 25: log_σ²=0.038590, weight=0.962145 2025-07-16 10:45:07,159 - INFO - log_σ² gradient: -0.527693 2025-07-16 10:45:07,230 - INFO - Optimizer step 26: log_σ²=0.038835, weight=0.961910 2025-07-16 10:45:29,315 - INFO - log_σ² gradient: -0.528341 2025-07-16 10:45:29,389 - INFO - Optimizer step 27: log_σ²=0.039081, weight=0.961673 2025-07-16 10:45:52,506 - INFO - log_σ² gradient: -0.522054 2025-07-16 10:45:52,576 - INFO - Optimizer step 28: log_σ²=0.039328, weight=0.961435 2025-07-16 10:46:14,978 - INFO - log_σ² gradient: -0.522763 2025-07-16 10:46:15,053 - INFO - Optimizer step 29: log_σ²=0.039576, weight=0.961197 2025-07-16 10:46:37,917 - INFO - log_σ² gradient: -0.523379 2025-07-16 10:46:37,987 - INFO - Optimizer step 30: log_σ²=0.039824, weight=0.960958 2025-07-16 10:47:00,545 - INFO - log_σ² gradient: -0.527123 2025-07-16 10:47:00,615 - INFO - Optimizer step 31: log_σ²=0.040074, weight=0.960719 2025-07-16 10:47:24,439 - INFO - log_σ² gradient: 
-0.521467 2025-07-16 10:47:24,520 - INFO - Optimizer step 32: log_σ²=0.040324, weight=0.960478 2025-07-16 10:47:47,788 - INFO - log_σ² gradient: -0.516979 2025-07-16 10:47:47,858 - INFO - Optimizer step 33: log_σ²=0.040575, weight=0.960238 2025-07-16 10:48:10,714 - INFO - log_σ² gradient: -0.519033 2025-07-16 10:48:10,787 - INFO - Optimizer step 34: log_σ²=0.040826, weight=0.959996 2025-07-16 10:48:33,298 - INFO - log_σ² gradient: -0.518464 2025-07-16 10:48:33,368 - INFO - Optimizer step 35: log_σ²=0.041078, weight=0.959754 2025-07-16 10:48:55,928 - INFO - log_σ² gradient: -0.523261 2025-07-16 10:48:56,000 - INFO - Optimizer step 36: log_σ²=0.041331, weight=0.959512 2025-07-16 10:49:18,237 - INFO - log_σ² gradient: -0.527285 2025-07-16 10:49:18,309 - INFO - Optimizer step 37: log_σ²=0.041585, weight=0.959268 2025-07-16 10:49:41,368 - INFO - log_σ² gradient: -0.523193 2025-07-16 10:49:41,437 - INFO - Optimizer step 38: log_σ²=0.041840, weight=0.959023 2025-07-16 10:50:02,798 - INFO - log_σ² gradient: -0.529500 2025-07-16 10:50:02,872 - INFO - Optimizer step 39: log_σ²=0.042096, weight=0.958778 2025-07-16 10:50:27,329 - INFO - log_σ² gradient: -0.517405 2025-07-16 10:50:27,402 - INFO - Optimizer step 40: log_σ²=0.042352, weight=0.958532 2025-07-16 10:50:50,420 - INFO - log_σ² gradient: -0.517722 2025-07-16 10:50:50,493 - INFO - Optimizer step 41: log_σ²=0.042609, weight=0.958286 2025-07-16 10:51:14,145 - INFO - log_σ² gradient: -0.510710 2025-07-16 10:51:14,218 - INFO - Optimizer step 42: log_σ²=0.042866, weight=0.958039 2025-07-16 10:51:37,877 - INFO - log_σ² gradient: -0.519447 2025-07-16 10:51:37,948 - INFO - Optimizer step 43: log_σ²=0.043124, weight=0.957792 2025-07-16 10:52:00,741 - INFO - log_σ² gradient: -0.520599 2025-07-16 10:52:00,811 - INFO - Optimizer step 44: log_σ²=0.043383, weight=0.957544 2025-07-16 10:52:24,054 - INFO - log_σ² gradient: -0.519932 2025-07-16 10:52:24,132 - INFO - Optimizer step 45: log_σ²=0.043643, weight=0.957296 2025-07-16 
10:52:45,730 - INFO - log_σ² gradient: -0.516737 2025-07-16 10:52:45,803 - INFO - Optimizer step 46: log_σ²=0.043903, weight=0.957047 2025-07-16 10:53:09,569 - INFO - log_σ² gradient: -0.513557 2025-07-16 10:53:09,640 - INFO - Optimizer step 47: log_σ²=0.044163, weight=0.956798 2025-07-16 10:53:32,885 - INFO - log_σ² gradient: -0.521306 2025-07-16 10:53:32,952 - INFO - Optimizer step 48: log_σ²=0.044425, weight=0.956547 2025-07-16 10:53:42,994 - INFO - log_σ² gradient: -0.242698 2025-07-16 10:53:43,071 - INFO - Optimizer step 49: log_σ²=0.044674, weight=0.956310 2025-07-16 10:53:43,252 - INFO - Epoch 7: Total optimizer steps: 49 2025-07-16 10:56:43,691 - INFO - Validation metrics: 2025-07-16 10:56:43,691 - INFO - Loss: 0.6940 2025-07-16 10:56:43,691 - INFO - BCE Loss: 0.5152 2025-07-16 10:56:43,691 - INFO - Weighted BCE Loss: 0.4927 2025-07-16 10:56:43,691 - INFO - Average similarity: 0.6816 2025-07-16 10:56:43,691 - INFO - Median similarity: 0.7055 2025-07-16 10:56:43,691 - INFO - Clean sample similarity: 0.6816 2025-07-16 10:56:43,691 - INFO - Corrupted sample similarity: 0.4603 2025-07-16 10:56:43,691 - INFO - Similarity gap (clean - corrupt): 0.2212 2025-07-16 10:56:43,879 - INFO - Epoch 7/30 - Train Loss: 0.7523, Val Loss: 0.6940, Val BCE: 0.5152, Val wBCE: 0.4927, Clean Sim: 0.6816, Corrupt Sim: 0.4603, Gap: 0.2212, Time: 1312.44s 2025-07-16 10:56:43,879 - INFO - New best validation loss: 0.6940 2025-07-16 10:56:47,349 - INFO - New best similarity gap: 0.2212 2025-07-16 10:57:24,119 - INFO - log_σ² gradient: -0.509232 2025-07-16 10:57:24,197 - INFO - Optimizer step 1: log_σ²=0.044924, weight=0.956070 2025-07-16 10:57:48,426 - INFO - log_σ² gradient: -0.513411 2025-07-16 10:57:48,498 - INFO - Optimizer step 2: log_σ²=0.045176, weight=0.955829 2025-07-16 10:58:12,397 - INFO - log_σ² gradient: -0.515256 2025-07-16 10:58:12,469 - INFO - Optimizer step 3: log_σ²=0.045430, weight=0.955587 2025-07-16 10:58:35,023 - INFO - log_σ² gradient: -0.518528 2025-07-16 
10:58:35,094 - INFO - Optimizer step 4: log_σ²=0.045685, weight=0.955342 2025-07-16 10:58:57,728 - INFO - log_σ² gradient: -0.517068 2025-07-16 10:58:57,801 - INFO - Optimizer step 5: log_σ²=0.045943, weight=0.955097 2025-07-16 10:59:20,661 - INFO - log_σ² gradient: -0.510404 2025-07-16 10:59:20,733 - INFO - Optimizer step 6: log_σ²=0.046201, weight=0.954850 2025-07-16 10:59:41,195 - INFO - log_σ² gradient: -0.516337 2025-07-16 10:59:41,262 - INFO - Optimizer step 7: log_σ²=0.046461, weight=0.954601 2025-07-16 11:00:03,387 - INFO - log_σ² gradient: -0.526070 2025-07-16 11:00:03,464 - INFO - Optimizer step 8: log_σ²=0.046723, weight=0.954351 2025-07-16 11:00:25,182 - INFO - log_σ² gradient: -0.509605 2025-07-16 11:00:25,253 - INFO - Optimizer step 9: log_σ²=0.046986, weight=0.954101 2025-07-16 11:00:49,182 - INFO - log_σ² gradient: -0.514848 2025-07-16 11:00:49,257 - INFO - Optimizer step 10: log_σ²=0.047250, weight=0.953848 2025-07-16 11:01:11,347 - INFO - log_σ² gradient: -0.506525 2025-07-16 11:01:11,426 - INFO - Optimizer step 11: log_σ²=0.047515, weight=0.953596 2025-07-16 11:01:32,806 - INFO - log_σ² gradient: -0.511995 2025-07-16 11:01:32,873 - INFO - Optimizer step 12: log_σ²=0.047781, weight=0.953342 2025-07-16 11:01:55,186 - INFO - log_σ² gradient: -0.516972 2025-07-16 11:01:55,254 - INFO - Optimizer step 13: log_σ²=0.048049, weight=0.953087 2025-07-16 11:02:17,075 - INFO - log_σ² gradient: -0.510252 2025-07-16 11:02:17,139 - INFO - Optimizer step 14: log_σ²=0.048317, weight=0.952832 2025-07-16 11:02:39,438 - INFO - log_σ² gradient: -0.520853 2025-07-16 11:02:39,512 - INFO - Optimizer step 15: log_σ²=0.048586, weight=0.952575 2025-07-16 11:03:01,158 - INFO - log_σ² gradient: -0.508632 2025-07-16 11:03:01,232 - INFO - Optimizer step 16: log_σ²=0.048857, weight=0.952318 2025-07-16 11:03:24,338 - INFO - log_σ² gradient: -0.517552 2025-07-16 11:03:24,415 - INFO - Optimizer step 17: log_σ²=0.049128, weight=0.952059 2025-07-16 11:03:47,206 - INFO - log_σ² 
gradient: -0.513232 2025-07-16 11:03:47,281 - INFO - Optimizer step 18: log_σ²=0.049401, weight=0.951800 2025-07-16 11:04:09,487 - INFO - log_σ² gradient: -0.520199 2025-07-16 11:04:09,562 - INFO - Optimizer step 19: log_σ²=0.049674, weight=0.951539 2025-07-16 11:04:31,230 - INFO - log_σ² gradient: -0.514183 2025-07-16 11:04:31,302 - INFO - Optimizer step 20: log_σ²=0.049949, weight=0.951278 2025-07-16 11:04:53,880 - INFO - log_σ² gradient: -0.524598 2025-07-16 11:04:53,949 - INFO - Optimizer step 21: log_σ²=0.050225, weight=0.951015 2025-07-16 11:05:18,294 - INFO - log_σ² gradient: -0.512512 2025-07-16 11:05:18,375 - INFO - Optimizer step 22: log_σ²=0.050502, weight=0.950752 2025-07-16 11:05:40,290 - INFO - log_σ² gradient: -0.513880 2025-07-16 11:05:40,360 - INFO - Optimizer step 23: log_σ²=0.050780, weight=0.950488 2025-07-16 11:06:02,916 - INFO - log_σ² gradient: -0.510354 2025-07-16 11:06:02,985 - INFO - Optimizer step 24: log_σ²=0.051058, weight=0.950223 2025-07-16 11:06:25,957 - INFO - log_σ² gradient: -0.520341 2025-07-16 11:06:26,034 - INFO - Optimizer step 25: log_σ²=0.051338, weight=0.949958 2025-07-16 11:06:48,755 - INFO - log_σ² gradient: -0.521423 2025-07-16 11:06:48,823 - INFO - Optimizer step 26: log_σ²=0.051619, weight=0.949691 2025-07-16 11:07:12,139 - INFO - log_σ² gradient: -0.515343 2025-07-16 11:07:12,213 - INFO - Optimizer step 27: log_σ²=0.051900, weight=0.949423 2025-07-16 11:07:36,382 - INFO - log_σ² gradient: -0.514419 2025-07-16 11:07:36,462 - INFO - Optimizer step 28: log_σ²=0.052183, weight=0.949155 2025-07-16 11:08:00,352 - INFO - log_σ² gradient: -0.510582 2025-07-16 11:08:00,421 - INFO - Optimizer step 29: log_σ²=0.052466, weight=0.948887 2025-07-16 11:08:23,791 - INFO - log_σ² gradient: -0.509405 2025-07-16 11:08:23,865 - INFO - Optimizer step 30: log_σ²=0.052749, weight=0.948618 2025-07-16 11:08:47,925 - INFO - log_σ² gradient: -0.501882 2025-07-16 11:08:47,998 - INFO - Optimizer step 31: log_σ²=0.053033, weight=0.948349 
2025-07-16 11:09:10,277 - INFO - log_σ² gradient: -0.498140 2025-07-16 11:09:10,347 - INFO - Optimizer step 32: log_σ²=0.053317, weight=0.948080 2025-07-16 11:09:33,635 - INFO - log_σ² gradient: -0.500281 2025-07-16 11:09:33,709 - INFO - Optimizer step 33: log_σ²=0.053601, weight=0.947810 2025-07-16 11:09:57,856 - INFO - log_σ² gradient: -0.509929 2025-07-16 11:09:57,928 - INFO - Optimizer step 34: log_σ²=0.053886, weight=0.947541 2025-07-16 11:10:21,485 - INFO - log_σ² gradient: -0.500750 2025-07-16 11:10:21,561 - INFO - Optimizer step 35: log_σ²=0.054171, weight=0.947270 2025-07-16 11:10:44,685 - INFO - log_σ² gradient: -0.516687 2025-07-16 11:10:44,754 - INFO - Optimizer step 36: log_σ²=0.054457, weight=0.946999 2025-07-16 11:11:07,022 - INFO - log_σ² gradient: -0.506898 2025-07-16 11:11:07,099 - INFO - Optimizer step 37: log_σ²=0.054744, weight=0.946728 2025-07-16 11:11:30,299 - INFO - log_σ² gradient: -0.506652 2025-07-16 11:11:30,369 - INFO - Optimizer step 38: log_σ²=0.055032, weight=0.946455 2025-07-16 11:11:52,389 - INFO - log_σ² gradient: -0.505728 2025-07-16 11:11:52,463 - INFO - Optimizer step 39: log_σ²=0.055320, weight=0.946183 2025-07-16 11:12:15,592 - INFO - log_σ² gradient: -0.501807 2025-07-16 11:12:15,661 - INFO - Optimizer step 40: log_σ²=0.055608, weight=0.945910 2025-07-16 11:12:37,899 - INFO - log_σ² gradient: -0.507822 2025-07-16 11:12:37,971 - INFO - Optimizer step 41: log_σ²=0.055898, weight=0.945636 2025-07-16 11:13:00,690 - INFO - log_σ² gradient: -0.502860 2025-07-16 11:13:00,764 - INFO - Optimizer step 42: log_σ²=0.056188, weight=0.945362 2025-07-16 11:13:22,724 - INFO - log_σ² gradient: -0.509156 2025-07-16 11:13:22,796 - INFO - Optimizer step 43: log_σ²=0.056478, weight=0.945087 2025-07-16 11:13:46,023 - INFO - log_σ² gradient: -0.506523 2025-07-16 11:13:46,093 - INFO - Optimizer step 44: log_σ²=0.056770, weight=0.944811 2025-07-16 11:14:08,812 - INFO - log_σ² gradient: -0.503488 2025-07-16 11:14:08,884 - INFO - Optimizer step 45: 
log_σ²=0.057062, weight=0.944535 2025-07-16 11:14:30,786 - INFO - log_σ² gradient: -0.510061 2025-07-16 11:14:30,863 - INFO - Optimizer step 46: log_σ²=0.057355, weight=0.944259 2025-07-16 11:14:53,871 - INFO - log_σ² gradient: -0.514276 2025-07-16 11:14:53,944 - INFO - Optimizer step 47: log_σ²=0.057649, weight=0.943981 2025-07-16 11:15:15,687 - INFO - log_σ² gradient: -0.506356 2025-07-16 11:15:15,757 - INFO - Optimizer step 48: log_σ²=0.057944, weight=0.943703 2025-07-16 11:15:26,756 - INFO - log_σ² gradient: -0.237399 2025-07-16 11:15:26,843 - INFO - Optimizer step 49: log_σ²=0.058225, weight=0.943438 2025-07-16 11:15:27,082 - INFO - Epoch 8: Total optimizer steps: 49 2025-07-16 11:18:27,392 - INFO - Validation metrics: 2025-07-16 11:18:27,393 - INFO - Loss: 0.6643 2025-07-16 11:18:27,393 - INFO - BCE Loss: 0.5075 2025-07-16 11:18:27,393 - INFO - Weighted BCE Loss: 0.4788 2025-07-16 11:18:27,393 - INFO - Average similarity: 0.6924 2025-07-16 11:18:27,393 - INFO - Median similarity: 0.7318 2025-07-16 11:18:27,393 - INFO - Clean sample similarity: 0.6924 2025-07-16 11:18:27,393 - INFO - Corrupted sample similarity: 0.4105 2025-07-16 11:18:27,393 - INFO - Similarity gap (clean - corrupt): 0.2819 2025-07-16 11:18:27,582 - INFO - Epoch 8/30 - Train Loss: 0.7218, Val Loss: 0.6643, Val BCE: 0.5075, Val wBCE: 0.4788, Clean Sim: 0.6924, Corrupt Sim: 0.4105, Gap: 0.2819, Time: 1296.82s 2025-07-16 11:18:27,582 - INFO - New best validation loss: 0.6643 2025-07-16 11:18:31,075 - INFO - New best similarity gap: 0.2819 2025-07-16 11:21:20,725 - INFO - Epoch 8 Validation Alignment: Pos=0.169, Neg=0.111, Gap=0.058 2025-07-16 11:21:54,438 - INFO - log_σ² gradient: -0.500408 2025-07-16 11:21:54,513 - INFO - Optimizer step 1: log_σ²=0.058507, weight=0.943172 2025-07-16 11:22:16,650 - INFO - log_σ² gradient: -0.504298 2025-07-16 11:22:16,723 - INFO - Optimizer step 2: log_σ²=0.058791, weight=0.942904 2025-07-16 11:22:39,157 - INFO - log_σ² gradient: -0.500099 2025-07-16 
11:22:39,224 - INFO - Optimizer step 3: log_σ²=0.059077, weight=0.942634 2025-07-16 11:23:01,345 - INFO - log_σ² gradient: -0.510107 2025-07-16 11:23:01,418 - INFO - Optimizer step 4: log_σ²=0.059365, weight=0.942363 2025-07-16 11:23:25,210 - INFO - log_σ² gradient: -0.505759 2025-07-16 11:23:25,279 - INFO - Optimizer step 5: log_σ²=0.059655, weight=0.942090 2025-07-16 11:23:48,581 - INFO - log_σ² gradient: -0.506803 2025-07-16 11:23:48,653 - INFO - Optimizer step 6: log_σ²=0.059946, weight=0.941815 2025-07-16 11:24:11,011 - INFO - log_σ² gradient: -0.503583 2025-07-16 11:24:11,093 - INFO - Optimizer step 7: log_σ²=0.060239, weight=0.941539 2025-07-16 11:24:34,422 - INFO - log_σ² gradient: -0.501324 2025-07-16 11:24:34,495 - INFO - Optimizer step 8: log_σ²=0.060534, weight=0.941262 2025-07-16 11:24:57,178 - INFO - log_σ² gradient: -0.505976 2025-07-16 11:24:57,252 - INFO - Optimizer step 9: log_σ²=0.060829, weight=0.940984 2025-07-16 11:25:19,212 - INFO - log_σ² gradient: -0.504133 2025-07-16 11:25:19,286 - INFO - Optimizer step 10: log_σ²=0.061126, weight=0.940704 2025-07-16 11:25:41,959 - INFO - log_σ² gradient: -0.505467 2025-07-16 11:25:42,034 - INFO - Optimizer step 11: log_σ²=0.061425, weight=0.940424 2025-07-16 11:26:05,923 - INFO - log_σ² gradient: -0.522611 2025-07-16 11:26:06,001 - INFO - Optimizer step 12: log_σ²=0.061726, weight=0.940141 2025-07-16 11:26:28,158 - INFO - log_σ² gradient: -0.542571 2025-07-16 11:26:28,228 - INFO - Optimizer step 13: log_σ²=0.062030, weight=0.939855 2025-07-16 11:26:51,432 - INFO - log_σ² gradient: -0.557848 2025-07-16 11:26:51,508 - INFO - Optimizer step 14: log_σ²=0.062338, weight=0.939566 2025-07-16 11:27:12,282 - INFO - log_σ² gradient: -0.553913 2025-07-16 11:27:12,354 - INFO - Optimizer step 15: log_σ²=0.062649, weight=0.939273 2025-07-16 11:27:35,163 - INFO - log_σ² gradient: -0.514126 2025-07-16 11:27:35,234 - INFO - Optimizer step 16: log_σ²=0.062961, weight=0.938980 2025-07-16 11:27:58,254 - INFO - log_σ² 
gradient: -0.511176 2025-07-16 11:27:58,327 - INFO - Optimizer step 17: log_σ²=0.063274, weight=0.938686 2025-07-16 11:28:22,769 - INFO - log_σ² gradient: -0.513312 2025-07-16 11:28:22,842 - INFO - Optimizer step 18: log_σ²=0.063588, weight=0.938392 2025-07-16 11:28:45,667 - INFO - log_σ² gradient: -0.510525 2025-07-16 11:28:45,737 - INFO - Optimizer step 19: log_σ²=0.063902, weight=0.938097 2025-07-16 11:29:07,113 - INFO - log_σ² gradient: -0.514193 2025-07-16 11:29:07,183 - INFO - Optimizer step 20: log_σ²=0.064218, weight=0.937801 2025-07-16 11:29:28,410 - INFO - log_σ² gradient: -0.547375 2025-07-16 11:29:28,476 - INFO - Optimizer step 21: log_σ²=0.064536, weight=0.937503 2025-07-16 11:29:50,795 - INFO - log_σ² gradient: -0.629511 2025-07-16 11:29:50,869 - INFO - Optimizer step 22: log_σ²=0.064861, weight=0.937198 2025-07-16 11:30:15,102 - INFO - log_σ² gradient: -0.677429 2025-07-16 11:30:15,175 - INFO - Optimizer step 23: log_σ²=0.065196, weight=0.936883 2025-07-16 11:30:38,672 - INFO - log_σ² gradient: -0.739648 2025-07-16 11:30:38,742 - INFO - Optimizer step 24: log_σ²=0.065544, weight=0.936558 2025-07-16 11:31:01,570 - INFO - log_σ² gradient: -0.738089 2025-07-16 11:31:01,643 - INFO - Optimizer step 25: log_σ²=0.065903, weight=0.936221 2025-07-16 11:31:25,301 - INFO - log_σ² gradient: -0.635700 2025-07-16 11:31:25,372 - INFO - Optimizer step 26: log_σ²=0.066267, weight=0.935881 2025-07-16 11:31:48,768 - INFO - log_σ² gradient: -0.570009 2025-07-16 11:31:48,837 - INFO - Optimizer step 27: log_σ²=0.066630, weight=0.935542 2025-07-16 11:32:11,314 - INFO - log_σ² gradient: -0.545476 2025-07-16 11:32:11,388 - INFO - Optimizer step 28: log_σ²=0.066991, weight=0.935204 2025-07-16 11:32:34,178 - INFO - log_σ² gradient: -0.725778 2025-07-16 11:32:34,246 - INFO - Optimizer step 29: log_σ²=0.067362, weight=0.934857 2025-07-16 11:32:56,557 - INFO - log_σ² gradient: -0.764338 2025-07-16 11:32:56,630 - INFO - Optimizer step 30: log_σ²=0.067744, weight=0.934500 
2025-07-16 11:33:18,064 - INFO - log_σ² gradient: -0.588229 2025-07-16 11:33:18,134 - INFO - Optimizer step 31: log_σ²=0.068125, weight=0.934144 2025-07-16 11:33:42,193 - INFO - log_σ² gradient: -0.528920 2025-07-16 11:33:42,263 - INFO - Optimizer step 32: log_σ²=0.068502, weight=0.933791 2025-07-16 11:34:06,438 - INFO - log_σ² gradient: -0.525529 2025-07-16 11:34:06,510 - INFO - Optimizer step 33: log_σ²=0.068875, weight=0.933443 2025-07-16 11:34:29,201 - INFO - log_σ² gradient: -0.532767 2025-07-16 11:34:29,273 - INFO - Optimizer step 34: log_σ²=0.069246, weight=0.933097 2025-07-16 11:34:51,506 - INFO - log_σ² gradient: -0.558941 2025-07-16 11:34:51,579 - INFO - Optimizer step 35: log_σ²=0.069615, weight=0.932753 2025-07-16 11:35:14,780 - INFO - log_σ² gradient: -0.588809 2025-07-16 11:35:14,857 - INFO - Optimizer step 36: log_σ²=0.069985, weight=0.932407 2025-07-16 11:35:36,527 - INFO - log_σ² gradient: -0.542670 2025-07-16 11:35:36,597 - INFO - Optimizer step 37: log_σ²=0.070354, weight=0.932064 2025-07-16 11:35:59,635 - INFO - log_σ² gradient: -0.526889 2025-07-16 11:35:59,705 - INFO - Optimizer step 38: log_σ²=0.070720, weight=0.931723 2025-07-16 11:36:21,912 - INFO - log_σ² gradient: -0.535543 2025-07-16 11:36:21,991 - INFO - Optimizer step 39: log_σ²=0.071084, weight=0.931384 2025-07-16 11:36:45,559 - INFO - log_σ² gradient: -0.523105 2025-07-16 11:36:45,632 - INFO - Optimizer step 40: log_σ²=0.071446, weight=0.931047 2025-07-16 11:37:08,097 - INFO - log_σ² gradient: -0.557049 2025-07-16 11:37:08,164 - INFO - Optimizer step 41: log_σ²=0.071808, weight=0.930709 2025-07-16 11:37:31,233 - INFO - log_σ² gradient: -0.539576 2025-07-16 11:37:31,312 - INFO - Optimizer step 42: log_σ²=0.072169, weight=0.930373 2025-07-16 11:37:54,285 - INFO - log_σ² gradient: -0.551508 2025-07-16 11:37:54,363 - INFO - Optimizer step 43: log_σ²=0.072531, weight=0.930037 2025-07-16 11:38:17,372 - INFO - log_σ² gradient: -0.632882 2025-07-16 11:38:17,447 - INFO - Optimizer step 44: 
log_σ²=0.072897, weight=0.929696 2025-07-16 11:38:39,576 - INFO - log_σ² gradient: -0.599468 2025-07-16 11:38:39,647 - INFO - Optimizer step 45: log_σ²=0.073267, weight=0.929353 2025-07-16 11:39:02,396 - INFO - log_σ² gradient: -0.562022 2025-07-16 11:39:02,468 - INFO - Optimizer step 46: log_σ²=0.073636, weight=0.929010 2025-07-16 11:39:26,584 - INFO - log_σ² gradient: -0.555501 2025-07-16 11:39:26,656 - INFO - Optimizer step 47: log_σ²=0.074005, weight=0.928667 2025-07-16 11:39:49,297 - INFO - log_σ² gradient: -0.537252 2025-07-16 11:39:49,367 - INFO - Optimizer step 48: log_σ²=0.074373, weight=0.928325 2025-07-16 11:40:00,058 - INFO - log_σ² gradient: -0.241874 2025-07-16 11:40:00,138 - INFO - Optimizer step 49: log_σ²=0.074722, weight=0.928002 2025-07-16 11:40:00,397 - INFO - Epoch 9: Total optimizer steps: 49 2025-07-16 11:43:00,488 - INFO - Validation metrics: 2025-07-16 11:43:00,488 - INFO - Loss: 0.6989 2025-07-16 11:43:00,488 - INFO - BCE Loss: 0.5267 2025-07-16 11:43:00,488 - INFO - Weighted BCE Loss: 0.4888 2025-07-16 11:43:00,488 - INFO - Average similarity: 0.6940 2025-07-16 11:43:00,488 - INFO - Median similarity: 0.7147 2025-07-16 11:43:00,488 - INFO - Clean sample similarity: 0.6940 2025-07-16 11:43:00,488 - INFO - Corrupted sample similarity: 0.4796 2025-07-16 11:43:00,489 - INFO - Similarity gap (clean - corrupt): 0.2144 2025-07-16 11:43:00,687 - INFO - Epoch 9/30 - Train Loss: 0.8725, Val Loss: 0.6989, Val BCE: 0.5267, Val wBCE: 0.4888, Clean Sim: 0.6940, Corrupt Sim: 0.4796, Gap: 0.2144, Time: 1299.96s 2025-07-16 11:43:34,809 - INFO - log_σ² gradient: -0.524367 2025-07-16 11:43:34,882 - INFO - Optimizer step 1: log_σ²=0.075070, weight=0.927679 2025-07-16 11:43:57,210 - INFO - log_σ² gradient: -0.518596 2025-07-16 11:43:57,284 - INFO - Optimizer step 2: log_σ²=0.075418, weight=0.927356 2025-07-16 11:44:19,614 - INFO - log_σ² gradient: -0.521800 2025-07-16 11:44:19,693 - INFO - Optimizer step 3: log_σ²=0.075766, weight=0.927033 2025-07-16 
11:44:44,865 - INFO - log_σ² gradient: -0.513255 2025-07-16 11:44:44,938 - INFO - Optimizer step 4: log_σ²=0.076114, weight=0.926710 2025-07-16 11:45:08,075 - INFO - log_σ² gradient: -0.514677 2025-07-16 11:45:08,149 - INFO - Optimizer step 5: log_σ²=0.076462, weight=0.926388 2025-07-16 11:45:30,704 - INFO - log_σ² gradient: -0.513624 2025-07-16 11:45:30,777 - INFO - Optimizer step 6: log_σ²=0.076810, weight=0.926066 2025-07-16 11:45:53,546 - INFO - log_σ² gradient: -0.504107 2025-07-16 11:45:53,621 - INFO - Optimizer step 7: log_σ²=0.077157, weight=0.925744 2025-07-16 11:46:16,386 - INFO - log_σ² gradient: -0.500905 2025-07-16 11:46:16,456 - INFO - Optimizer step 8: log_σ²=0.077504, weight=0.925423 2025-07-16 11:46:39,844 - INFO - log_σ² gradient: -0.504462 2025-07-16 11:46:39,921 - INFO - Optimizer step 9: log_σ²=0.077850, weight=0.925103 2025-07-16 11:47:04,435 - INFO - log_σ² gradient: -0.503925 2025-07-16 11:47:04,502 - INFO - Optimizer step 10: log_σ²=0.078196, weight=0.924783 2025-07-16 11:47:26,678 - INFO - log_σ² gradient: -0.505726 2025-07-16 11:47:26,748 - INFO - Optimizer step 11: log_σ²=0.078542, weight=0.924463 2025-07-16 11:47:51,080 - INFO - log_σ² gradient: -0.499523 2025-07-16 11:47:51,162 - INFO - Optimizer step 12: log_σ²=0.078888, weight=0.924144 2025-07-16 11:48:14,450 - INFO - log_σ² gradient: -0.509638 2025-07-16 11:48:14,519 - INFO - Optimizer step 13: log_σ²=0.079234, weight=0.923824 2025-07-16 11:48:36,549 - INFO - log_σ² gradient: -0.497109 2025-07-16 11:48:36,619 - INFO - Optimizer step 14: log_σ²=0.079580, weight=0.923504 2025-07-16 11:49:00,383 - INFO - log_σ² gradient: -0.491271 2025-07-16 11:49:00,455 - INFO - Optimizer step 15: log_σ²=0.079925, weight=0.923186 2025-07-16 11:49:22,978 - INFO - log_σ² gradient: -0.499891 2025-07-16 11:49:23,049 - INFO - Optimizer step 16: log_σ²=0.080270, weight=0.922867 2025-07-16 11:49:45,318 - INFO - log_σ² gradient: -0.495250 2025-07-16 11:49:45,387 - INFO - Optimizer step 17: log_σ²=0.080615, 
weight=0.922549 2025-07-16 11:50:07,509 - INFO - log_σ² gradient: -0.501872 2025-07-16 11:50:07,579 - INFO - Optimizer step 18: log_σ²=0.080961, weight=0.922230 2025-07-16 11:50:29,302 - INFO - log_σ² gradient: -0.488271 2025-07-16 11:50:29,384 - INFO - Optimizer step 19: log_σ²=0.081305, weight=0.921912 2025-07-16 11:50:52,246 - INFO - log_σ² gradient: -0.503951 2025-07-16 11:50:52,319 - INFO - Optimizer step 20: log_σ²=0.081651, weight=0.921593 2025-07-16 11:51:14,333 - INFO - log_σ² gradient: -0.497236 2025-07-16 11:51:14,407 - INFO - Optimizer step 21: log_σ²=0.081997, weight=0.921275 2025-07-16 11:51:36,574 - INFO - log_σ² gradient: -0.489772 2025-07-16 11:51:36,647 - INFO - Optimizer step 22: log_σ²=0.082343, weight=0.920956 2025-07-16 11:51:59,421 - INFO - log_σ² gradient: -0.495478 2025-07-16 11:51:59,494 - INFO - Optimizer step 23: log_σ²=0.082689, weight=0.920638 2025-07-16 11:52:22,293 - INFO - log_σ² gradient: -0.489603 2025-07-16 11:52:22,366 - INFO - Optimizer step 24: log_σ²=0.083035, weight=0.920319 2025-07-16 11:52:44,067 - INFO - log_σ² gradient: -0.494158 2025-07-16 11:52:44,145 - INFO - Optimizer step 25: log_σ²=0.083381, weight=0.920001 2025-07-16 11:53:06,811 - INFO - log_σ² gradient: -0.499176 2025-07-16 11:53:06,889 - INFO - Optimizer step 26: log_σ²=0.083728, weight=0.919681 2025-07-16 11:53:29,578 - INFO - log_σ² gradient: -0.487194 2025-07-16 11:53:29,651 - INFO - Optimizer step 27: log_σ²=0.084075, weight=0.919362 2025-07-16 11:53:52,222 - INFO - log_σ² gradient: -0.493224 2025-07-16 11:53:52,292 - INFO - Optimizer step 28: log_σ²=0.084422, weight=0.919043 2025-07-16 11:54:15,970 - INFO - log_σ² gradient: -0.493789 2025-07-16 11:54:16,048 - INFO - Optimizer step 29: log_σ²=0.084770, weight=0.918724 2025-07-16 11:54:38,242 - INFO - log_σ² gradient: -0.500121 2025-07-16 11:54:38,249 - INFO - Optimizer step 30: log_σ²=0.085119, weight=0.918403 2025-07-16 11:54:59,995 - INFO - log_σ² gradient: -0.483033 2025-07-16 11:55:00,068 - INFO - 
Optimizer step 31: log_σ²=0.085467, weight=0.918083 2025-07-16 11:55:21,258 - INFO - log_σ² gradient: -0.489658 2025-07-16 11:55:21,332 - INFO - Optimizer step 32: log_σ²=0.085816, weight=0.917763 2025-07-16 11:55:44,562 - INFO - log_σ² gradient: -0.479115 2025-07-16 11:55:44,636 - INFO - Optimizer step 33: log_σ²=0.086165, weight=0.917443 2025-07-16 11:56:07,281 - INFO - log_σ² gradient: -0.489810 2025-07-16 11:56:07,351 - INFO - Optimizer step 34: log_σ²=0.086514, weight=0.917123 2025-07-16 11:56:29,240 - INFO - log_σ² gradient: -0.493721 2025-07-16 11:56:29,307 - INFO - Optimizer step 35: log_σ²=0.086863, weight=0.916802 2025-07-16 11:56:50,661 - INFO - log_σ² gradient: -0.482316 2025-07-16 11:56:50,734 - INFO - Optimizer step 36: log_σ²=0.087213, weight=0.916482 2025-07-16 11:57:12,480 - INFO - log_σ² gradient: -0.485248 2025-07-16 11:57:12,550 - INFO - Optimizer step 37: log_σ²=0.087563, weight=0.916161 2025-07-16 11:57:33,911 - INFO - log_σ² gradient: -0.491957 2025-07-16 11:57:33,982 - INFO - Optimizer step 38: log_σ²=0.087914, weight=0.915840 2025-07-16 11:57:55,737 - INFO - log_σ² gradient: -0.489267 2025-07-16 11:57:55,816 - INFO - Optimizer step 39: log_σ²=0.088266, weight=0.915518 2025-07-16 11:58:17,364 - INFO - log_σ² gradient: -0.497547 2025-07-16 11:58:17,442 - INFO - Optimizer step 40: log_σ²=0.088618, weight=0.915195 2025-07-16 11:58:39,739 - INFO - log_σ² gradient: -0.495060 2025-07-16 11:58:39,812 - INFO - Optimizer step 41: log_σ²=0.088972, weight=0.914871 2025-07-16 11:59:01,894 - INFO - log_σ² gradient: -0.495531 2025-07-16 11:59:01,972 - INFO - Optimizer step 42: log_σ²=0.089327, weight=0.914547 2025-07-16 11:59:25,189 - INFO - log_σ² gradient: -0.495333 2025-07-16 11:59:25,268 - INFO - Optimizer step 43: log_σ²=0.089682, weight=0.914221 2025-07-16 11:59:47,118 - INFO - log_σ² gradient: -0.480317 2025-07-16 11:59:47,191 - INFO - Optimizer step 44: log_σ²=0.090038, weight=0.913896 2025-07-16 12:00:09,034 - INFO - log_σ² gradient: -0.478702 
2025-07-16 12:00:09,108 - INFO - Optimizer step 45: log_σ²=0.090394, weight=0.913572 2025-07-16 12:00:30,588 - INFO - log_σ² gradient: -0.484337 2025-07-16 12:00:30,658 - INFO - Optimizer step 46: log_σ²=0.090749, weight=0.913247 2025-07-16 12:00:51,982 - INFO - log_σ² gradient: -0.483863 2025-07-16 12:00:52,060 - INFO - Optimizer step 47: log_σ²=0.091106, weight=0.912921 2025-07-16 12:01:11,683 - INFO - log_σ² gradient: -0.485469 2025-07-16 12:01:11,757 - INFO - Optimizer step 48: log_σ²=0.091462, weight=0.912596 2025-07-16 12:01:22,432 - INFO - log_σ² gradient: -0.226662 2025-07-16 12:01:22,503 - INFO - Optimizer step 49: log_σ²=0.091801, weight=0.912287 2025-07-16 12:01:22,782 - INFO - Epoch 10: Total optimizer steps: 49 2025-07-16 12:04:17,978 - INFO - Validation metrics: 2025-07-16 12:04:17,978 - INFO - Loss: 0.6282 2025-07-16 12:04:17,978 - INFO - BCE Loss: 0.4865 2025-07-16 12:04:17,978 - INFO - Weighted BCE Loss: 0.4438 2025-07-16 12:04:17,978 - INFO - Average similarity: 0.6218 2025-07-16 12:04:17,978 - INFO - Median similarity: 0.6319 2025-07-16 12:04:17,978 - INFO - Clean sample similarity: 0.6218 2025-07-16 12:04:17,978 - INFO - Corrupted sample similarity: 0.3554 2025-07-16 12:04:17,978 - INFO - Similarity gap (clean - corrupt): 0.2664 2025-07-16 12:04:18,199 - INFO - Epoch 10/30 - Train Loss: 0.7031, Val Loss: 0.6282, Val BCE: 0.4865, Val wBCE: 0.4438, Clean Sim: 0.6218, Corrupt Sim: 0.3554, Gap: 0.2664, Time: 1277.51s 2025-07-16 12:04:18,199 - INFO - New best validation loss: 0.6282 2025-07-16 12:07:05,118 - INFO - Epoch 10 Validation Alignment: Pos=0.165, Neg=0.113, Gap=0.052 2025-07-16 12:07:37,856 - INFO - log_σ² gradient: -0.481695 2025-07-16 12:07:37,931 - INFO - Optimizer step 1: log_σ²=0.092142, weight=0.911976 2025-07-16 12:07:58,586 - INFO - log_σ² gradient: -0.475367 2025-07-16 12:07:58,656 - INFO - Optimizer step 2: log_σ²=0.092484, weight=0.911663 2025-07-16 12:08:19,537 - INFO - log_σ² gradient: -0.488800 2025-07-16 12:08:19,609 - INFO - 
Optimizer step 3: log_σ²=0.092829, weight=0.911349 2025-07-16 12:08:41,016 - INFO - log_σ² gradient: -0.478493 2025-07-16 12:08:41,089 - INFO - Optimizer step 4: log_σ²=0.093176, weight=0.911033 2025-07-16 12:09:02,242 - INFO - log_σ² gradient: -0.488244 2025-07-16 12:09:02,325 - INFO - Optimizer step 5: log_σ²=0.093524, weight=0.910716 2025-07-16 12:09:24,055 - INFO - log_σ² gradient: -0.487898 2025-07-16 12:09:24,129 - INFO - Optimizer step 6: log_σ²=0.093875, weight=0.910397 2025-07-16 12:09:44,224 - INFO - log_σ² gradient: -0.484634 2025-07-16 12:09:44,306 - INFO - Optimizer step 7: log_σ²=0.094227, weight=0.910076 2025-07-16 12:10:05,020 - INFO - log_σ² gradient: -0.483612 2025-07-16 12:10:05,093 - INFO - Optimizer step 8: log_σ²=0.094581, weight=0.909754 2025-07-16 12:10:26,717 - INFO - log_σ² gradient: -0.479013 2025-07-16 12:10:26,788 - INFO - Optimizer step 9: log_σ²=0.094936, weight=0.909431 2025-07-16 12:10:47,860 - INFO - log_σ² gradient: -0.486041 2025-07-16 12:10:47,937 - INFO - Optimizer step 10: log_σ²=0.095293, weight=0.909107 2025-07-16 12:11:08,759 - INFO - log_σ² gradient: -0.478711 2025-07-16 12:11:08,832 - INFO - Optimizer step 11: log_σ²=0.095650, weight=0.908782 2025-07-16 12:11:31,600 - INFO - log_σ² gradient: -0.470141 2025-07-16 12:11:31,677 - INFO - Optimizer step 12: log_σ²=0.096008, weight=0.908456 2025-07-16 12:11:53,443 - INFO - log_σ² gradient: -0.484419 2025-07-16 12:11:53,515 - INFO - Optimizer step 13: log_σ²=0.096368, weight=0.908130 2025-07-16 12:12:12,949 - INFO - log_σ² gradient: -0.488628 2025-07-16 12:12:13,023 - INFO - Optimizer step 14: log_σ²=0.096729, weight=0.907802 2025-07-16 12:12:35,924 - INFO - log_σ² gradient: -0.483253 2025-07-16 12:12:36,001 - INFO - Optimizer step 15: log_σ²=0.097091, weight=0.907473 2025-07-16 12:12:57,182 - INFO - log_σ² gradient: -0.487794 2025-07-16 12:12:57,256 - INFO - Optimizer step 16: log_σ²=0.097455, weight=0.907143 2025-07-16 12:13:18,662 - INFO - log_σ² gradient: -0.482812 
2025-07-16 12:13:18,730 - INFO - Optimizer step 17: log_σ²=0.097819, weight=0.906813 2025-07-16 12:13:40,048 - INFO - log_σ² gradient: -0.483993 2025-07-16 12:13:40,117 - INFO - Optimizer step 18: log_σ²=0.098185, weight=0.906481 2025-07-16 12:14:01,214 - INFO - log_σ² gradient: -0.476954 2025-07-16 12:14:01,287 - INFO - Optimizer step 19: log_σ²=0.098552, weight=0.906149 2025-07-16 12:14:22,903 - INFO - log_σ² gradient: -0.477233 2025-07-16 12:14:22,982 - INFO - Optimizer step 20: log_σ²=0.098919, weight=0.905816 2025-07-16 12:14:45,422 - INFO - log_σ² gradient: -0.465406 2025-07-16 12:14:45,499 - INFO - Optimizer step 21: log_σ²=0.099286, weight=0.905484 2025-07-16 12:15:06,492 - INFO - log_σ² gradient: -0.490059 2025-07-16 12:15:06,563 - INFO - Optimizer step 22: log_σ²=0.099654, weight=0.905150 2025-07-16 12:15:28,212 - INFO - log_σ² gradient: -0.482112 2025-07-16 12:15:28,282 - INFO - Optimizer step 23: log_σ²=0.100024, weight=0.904816 2025-07-16 12:15:48,880 - INFO - log_σ² gradient: -0.472202 2025-07-16 12:15:48,954 - INFO - Optimizer step 24: log_σ²=0.100393, weight=0.904481 2025-07-16 12:16:09,205 - INFO - log_σ² gradient: -0.487132 2025-07-16 12:16:09,275 - INFO - Optimizer step 25: log_σ²=0.100765, weight=0.904146 2025-07-16 12:16:32,201 - INFO - log_σ² gradient: -0.479444 2025-07-16 12:16:32,287 - INFO - Optimizer step 26: log_σ²=0.101137, weight=0.903809 2025-07-16 12:16:53,828 - INFO - log_σ² gradient: -0.474467 2025-07-16 12:16:53,898 - INFO - Optimizer step 27: log_σ²=0.101509, weight=0.903473 2025-07-16 12:17:14,962 - INFO - log_σ² gradient: -0.486288 2025-07-16 12:17:15,034 - INFO - Optimizer step 28: log_σ²=0.101883, weight=0.903135 2025-07-16 12:17:36,202 - INFO - log_σ² gradient: -0.480342 2025-07-16 12:17:36,271 - INFO - Optimizer step 29: log_σ²=0.102257, weight=0.902797 2025-07-16 12:17:57,725 - INFO - log_σ² gradient: -0.480242 2025-07-16 12:17:57,795 - INFO - Optimizer step 30: log_σ²=0.102633, weight=0.902458 2025-07-16 12:18:18,492 - 
INFO - log_σ² gradient: -0.486256 2025-07-16 12:18:18,573 - INFO - Optimizer step 31: log_σ²=0.103009, weight=0.902118 2025-07-16 12:18:41,508 - INFO - log_σ² gradient: -0.475791 2025-07-16 12:18:41,581 - INFO - Optimizer step 32: log_σ²=0.103387, weight=0.901778 2025-07-16 12:19:03,095 - INFO - log_σ² gradient: -0.489252 2025-07-16 12:19:03,167 - INFO - Optimizer step 33: log_σ²=0.103765, weight=0.901437 2025-07-16 12:19:24,349 - INFO - log_σ² gradient: -0.471951 2025-07-16 12:19:24,421 - INFO - Optimizer step 34: log_σ²=0.104144, weight=0.901096 2025-07-16 12:19:45,617 - INFO - log_σ² gradient: -0.478019 2025-07-16 12:19:45,691 - INFO - Optimizer step 35: log_σ²=0.104523, weight=0.900754 2025-07-16 12:20:06,797 - INFO - log_σ² gradient: -0.474524 2025-07-16 12:20:06,871 - INFO - Optimizer step 36: log_σ²=0.104903, weight=0.900412 2025-07-16 12:20:28,946 - INFO - log_σ² gradient: -0.469375 2025-07-16 12:20:29,018 - INFO - Optimizer step 37: log_σ²=0.105283, weight=0.900070 2025-07-16 12:20:50,985 - INFO - log_σ² gradient: -0.473093 2025-07-16 12:20:51,063 - INFO - Optimizer step 38: log_σ²=0.105663, weight=0.899728 2025-07-16 12:21:12,491 - INFO - log_σ² gradient: -0.468875 2025-07-16 12:21:12,567 - INFO - Optimizer step 39: log_σ²=0.106043, weight=0.899386 2025-07-16 12:21:34,812 - INFO - log_σ² gradient: -0.474165 2025-07-16 12:21:34,890 - INFO - Optimizer step 40: log_σ²=0.106424, weight=0.899043 2025-07-16 12:21:56,570 - INFO - log_σ² gradient: -0.470201 2025-07-16 12:21:56,645 - INFO - Optimizer step 41: log_σ²=0.106805, weight=0.898700 2025-07-16 12:22:18,486 - INFO - log_σ² gradient: -0.471927 2025-07-16 12:22:18,561 - INFO - Optimizer step 42: log_σ²=0.107187, weight=0.898358 2025-07-16 12:22:39,369 - INFO - log_σ² gradient: -0.480961 2025-07-16 12:22:39,439 - INFO - Optimizer step 43: log_σ²=0.107570, weight=0.898014 2025-07-16 12:22:58,519 - INFO - log_σ² gradient: -0.468281 2025-07-16 12:22:58,589 - INFO - Optimizer step 44: log_σ²=0.107953, 
weight=0.897670 2025-07-16 12:23:21,097 - INFO - log_σ² gradient: -0.473082 2025-07-16 12:23:21,180 - INFO - Optimizer step 45: log_σ²=0.108337, weight=0.897325 2025-07-16 12:23:42,431 - INFO - log_σ² gradient: -0.476086 2025-07-16 12:23:42,509 - INFO - Optimizer step 46: log_σ²=0.108721, weight=0.896980 2025-07-16 12:24:03,686 - INFO - log_σ² gradient: -0.467600 2025-07-16 12:24:03,759 - INFO - Optimizer step 47: log_σ²=0.109106, weight=0.896635 2025-07-16 12:24:25,206 - INFO - log_σ² gradient: -0.468320 2025-07-16 12:24:25,284 - INFO - Optimizer step 48: log_σ²=0.109491, weight=0.896290 2025-07-16 12:24:35,242 - INFO - log_σ² gradient: -0.216205 2025-07-16 12:24:35,321 - INFO - Optimizer step 49: log_σ²=0.109856, weight=0.895963 2025-07-16 12:24:35,563 - INFO - Epoch 11: Total optimizer steps: 49 2025-07-16 12:27:30,479 - INFO - Validation metrics: 2025-07-16 12:27:30,479 - INFO - Loss: 0.6204 2025-07-16 12:27:30,479 - INFO - BCE Loss: 0.4782 2025-07-16 12:27:30,479 - INFO - Weighted BCE Loss: 0.4284 2025-07-16 12:27:30,479 - INFO - Average similarity: 0.7388 2025-07-16 12:27:30,479 - INFO - Median similarity: 0.7672 2025-07-16 12:27:30,479 - INFO - Clean sample similarity: 0.7388 2025-07-16 12:27:30,479 - INFO - Corrupted sample similarity: 0.4349 2025-07-16 12:27:30,479 - INFO - Similarity gap (clean - corrupt): 0.3040 2025-07-16 12:27:30,653 - INFO - Epoch 11/30 - Train Loss: 0.6574, Val Loss: 0.6204, Val BCE: 0.4782, Val wBCE: 0.4284, Clean Sim: 0.7388, Corrupt Sim: 0.4349, Gap: 0.3040, Time: 1225.53s 2025-07-16 12:27:30,654 - INFO - New best validation loss: 0.6204 2025-07-16 12:27:33,294 - INFO - New best similarity gap: 0.3040 2025-07-16 12:28:09,412 - INFO - log_σ² gradient: -0.476357 2025-07-16 12:28:09,490 - INFO - Optimizer step 1: log_σ²=0.110225, weight=0.895633 2025-07-16 12:28:31,076 - INFO - log_σ² gradient: -0.479305 2025-07-16 12:28:31,146 - INFO - Optimizer step 2: log_σ²=0.110596, weight=0.895300 2025-07-16 12:28:53,204 - INFO - log_σ² 
gradient: -0.469294 2025-07-16 12:28:53,276 - INFO - Optimizer step 3: log_σ²=0.110969, weight=0.894966 2025-07-16 12:29:14,925 - INFO - log_σ² gradient: -0.471503 2025-07-16 12:29:14,999 - INFO - Optimizer step 4: log_σ²=0.111345, weight=0.894630 2025-07-16 12:29:37,466 - INFO - log_σ² gradient: -0.467852 2025-07-16 12:29:37,540 - INFO - Optimizer step 5: log_σ²=0.111722, weight=0.894293 2025-07-16 12:29:59,888 - INFO - log_σ² gradient: -0.478037 2025-07-16 12:29:59,961 - INFO - Optimizer step 6: log_σ²=0.112102, weight=0.893953 2025-07-16 12:30:21,923 - INFO - log_σ² gradient: -0.475944 2025-07-16 12:30:21,993 - INFO - Optimizer step 7: log_σ²=0.112483, weight=0.893612 2025-07-16 12:30:43,664 - INFO - log_σ² gradient: -0.468051 2025-07-16 12:30:43,739 - INFO - Optimizer step 8: log_σ²=0.112867, weight=0.893270 2025-07-16 12:31:05,447 - INFO - log_σ² gradient: -0.471522 2025-07-16 12:31:05,519 - INFO - Optimizer step 9: log_σ²=0.113251, weight=0.892926 2025-07-16 12:31:27,702 - INFO - log_σ² gradient: -0.476198 2025-07-16 12:31:27,776 - INFO - Optimizer step 10: log_σ²=0.113638, weight=0.892581 2025-07-16 12:31:50,303 - INFO - log_σ² gradient: -0.486339 2025-07-16 12:31:50,376 - INFO - Optimizer step 11: log_σ²=0.114027, weight=0.892234 2025-07-16 12:32:11,027 - INFO - log_σ² gradient: -0.477192 2025-07-16 12:32:11,094 - INFO - Optimizer step 12: log_σ²=0.114417, weight=0.891886 2025-07-16 12:32:32,888 - INFO - log_σ² gradient: -0.471831 2025-07-16 12:32:32,962 - INFO - Optimizer step 13: log_σ²=0.114809, weight=0.891536 2025-07-16 12:32:55,112 - INFO - log_σ² gradient: -0.475967 2025-07-16 12:32:55,182 - INFO - Optimizer step 14: log_σ²=0.115203, weight=0.891186 2025-07-16 12:33:16,342 - INFO - log_σ² gradient: -0.470287 2025-07-16 12:33:16,412 - INFO - Optimizer step 15: log_σ²=0.115597, weight=0.890834 2025-07-16 12:33:36,980 - INFO - log_σ² gradient: -0.468603 2025-07-16 12:33:37,050 - INFO - Optimizer step 16: log_σ²=0.115992, weight=0.890483 2025-07-16 
12:33:58,508 - INFO - log_σ² gradient: -0.472668 2025-07-16 12:33:58,582 - INFO - Optimizer step 17: log_σ²=0.116388, weight=0.890130 2025-07-16 12:34:20,621 - INFO - log_σ² gradient: -0.475485 2025-07-16 12:34:20,699 - INFO - Optimizer step 18: log_σ²=0.116785, weight=0.889776 2025-07-16 12:34:42,308 - INFO - log_σ² gradient: -0.477349 2025-07-16 12:34:42,383 - INFO - Optimizer step 19: log_σ²=0.117184, weight=0.889421 2025-07-16 12:35:02,120 - INFO - log_σ² gradient: -0.459057 2025-07-16 12:35:02,193 - INFO - Optimizer step 20: log_σ²=0.117583, weight=0.889067 2025-07-16 12:35:24,064 - INFO - log_σ² gradient: -0.463442 2025-07-16 12:35:24,134 - INFO - Optimizer step 21: log_σ²=0.117982, weight=0.888712 2025-07-16 12:35:45,188 - INFO - log_σ² gradient: -0.479029 2025-07-16 12:35:45,261 - INFO - Optimizer step 22: log_σ²=0.118383, weight=0.888356 2025-07-16 12:36:06,637 - INFO - log_σ² gradient: -0.482367 2025-07-16 12:36:06,706 - INFO - Optimizer step 23: log_σ²=0.118785, weight=0.887999 2025-07-16 12:36:28,472 - INFO - log_σ² gradient: -0.464130 2025-07-16 12:36:28,542 - INFO - Optimizer step 24: log_σ²=0.119188, weight=0.887641 2025-07-16 12:36:49,203 - INFO - log_σ² gradient: -0.465735 2025-07-16 12:36:49,280 - INFO - Optimizer step 25: log_σ²=0.119591, weight=0.887283 2025-07-16 12:37:10,750 - INFO - log_σ² gradient: -0.480477 2025-07-16 12:37:10,828 - INFO - Optimizer step 26: log_σ²=0.119996, weight=0.886924 2025-07-16 12:37:31,583 - INFO - log_σ² gradient: -0.462309 2025-07-16 12:37:31,657 - INFO - Optimizer step 27: log_σ²=0.120400, weight=0.886565 2025-07-16 12:37:53,715 - INFO - log_σ² gradient: -0.468977 2025-07-16 12:37:53,792 - INFO - Optimizer step 28: log_σ²=0.120806, weight=0.886206 2025-07-16 12:38:14,563 - INFO - log_σ² gradient: -0.468256 2025-07-16 12:38:14,640 - INFO - Optimizer step 29: log_σ²=0.121212, weight=0.885846 2025-07-16 12:38:36,351 - INFO - log_σ² gradient: -0.465128 2025-07-16 12:38:36,421 - INFO - Optimizer step 30: 
log_σ²=0.121619, weight=0.885486 2025-07-16 12:38:57,677 - INFO - log_σ² gradient: -0.455628 2025-07-16 12:38:57,750 - INFO - Optimizer step 31: log_σ²=0.122025, weight=0.885126 2025-07-16 12:39:18,754 - INFO - log_σ² gradient: -0.464638 2025-07-16 12:39:18,828 - INFO - Optimizer step 32: log_σ²=0.122432, weight=0.884766 2025-07-16 12:39:39,927 - INFO - log_σ² gradient: -0.473950 2025-07-16 12:39:40,000 - INFO - Optimizer step 33: log_σ²=0.122840, weight=0.884405 2025-07-16 12:40:01,572 - INFO - log_σ² gradient: -0.464559 2025-07-16 12:40:01,650 - INFO - Optimizer step 34: log_σ²=0.123248, weight=0.884044 2025-07-16 12:40:23,250 - INFO - log_σ² gradient: -0.460013 2025-07-16 12:40:23,317 - INFO - Optimizer step 35: log_σ²=0.123657, weight=0.883683 2025-07-16 12:40:44,231 - INFO - log_σ² gradient: -0.465170 2025-07-16 12:40:44,301 - INFO - Optimizer step 36: log_σ²=0.124067, weight=0.883321 2025-07-16 12:41:05,592 - INFO - log_σ² gradient: -0.463168 2025-07-16 12:41:05,666 - INFO - Optimizer step 37: log_σ²=0.124476, weight=0.882959 2025-07-16 12:41:28,734 - INFO - log_σ² gradient: -0.472521 2025-07-16 12:41:28,804 - INFO - Optimizer step 38: log_σ²=0.124888, weight=0.882596 2025-07-16 12:41:50,503 - INFO - log_σ² gradient: -0.469765 2025-07-16 12:41:50,575 - INFO - Optimizer step 39: log_σ²=0.125300, weight=0.882232 2025-07-16 12:42:12,103 - INFO - log_σ² gradient: -0.468021 2025-07-16 12:42:12,171 - INFO - Optimizer step 40: log_σ²=0.125713, weight=0.881868 2025-07-16 12:42:33,255 - INFO - log_σ² gradient: -0.467800 2025-07-16 12:42:33,326 - INFO - Optimizer step 41: log_σ²=0.126127, weight=0.881503 2025-07-16 12:42:55,174 - INFO - log_σ² gradient: -0.467955 2025-07-16 12:42:55,246 - INFO - Optimizer step 42: log_σ²=0.126541, weight=0.881138 2025-07-16 12:43:16,989 - INFO - log_σ² gradient: -0.451843 2025-07-16 12:43:17,067 - INFO - Optimizer step 43: log_σ²=0.126956, weight=0.880773 2025-07-16 12:43:37,899 - INFO - log_σ² gradient: -0.481934 2025-07-16 
12:43:37,971 - INFO - Optimizer step 44: log_σ²=0.127372, weight=0.880406 2025-07-16 12:44:00,103 - INFO - log_σ² gradient: -0.471591 2025-07-16 12:44:00,177 - INFO - Optimizer step 45: log_σ²=0.127789, weight=0.880039 2025-07-16 12:44:20,726 - INFO - log_σ² gradient: -0.461374 2025-07-16 12:44:20,805 - INFO - Optimizer step 46: log_σ²=0.128207, weight=0.879671 2025-07-16 12:44:41,476 - INFO - log_σ² gradient: -0.459210 2025-07-16 12:44:41,549 - INFO - Optimizer step 47: log_σ²=0.128625, weight=0.879304 2025-07-16 12:45:03,496 - INFO - log_σ² gradient: -0.462209 2025-07-16 12:45:03,568 - INFO - Optimizer step 48: log_σ²=0.129043, weight=0.878936 2025-07-16 12:45:13,468 - INFO - log_σ² gradient: -0.218634 2025-07-16 12:45:13,541 - INFO - Optimizer step 49: log_σ²=0.129440, weight=0.878587 2025-07-16 12:45:13,762 - INFO - Epoch 12: Total optimizer steps: 49 2025-07-16 12:48:10,099 - INFO - Validation metrics: 2025-07-16 12:48:10,099 - INFO - Loss: 0.5930 2025-07-16 12:48:10,099 - INFO - BCE Loss: 0.4623 2025-07-16 12:48:10,099 - INFO - Weighted BCE Loss: 0.4062 2025-07-16 12:48:10,099 - INFO - Average similarity: 0.7108 2025-07-16 12:48:10,099 - INFO - Median similarity: 0.7270 2025-07-16 12:48:10,099 - INFO - Clean sample similarity: 0.7108 2025-07-16 12:48:10,099 - INFO - Corrupted sample similarity: 0.4176 2025-07-16 12:48:10,099 - INFO - Similarity gap (clean - corrupt): 0.2932 2025-07-16 12:48:10,302 - INFO - Epoch 12/30 - Train Loss: 0.6374, Val Loss: 0.5930, Val BCE: 0.4623, Val wBCE: 0.4062, Clean Sim: 0.7108, Corrupt Sim: 0.4176, Gap: 0.2932, Time: 1233.61s 2025-07-16 12:48:10,302 - INFO - New best validation loss: 0.5930 2025-07-16 12:50:57,122 - INFO - Epoch 12 Validation Alignment: Pos=0.204, Neg=0.122, Gap=0.082 2025-07-16 12:51:30,052 - INFO - log_σ² gradient: -0.460109 2025-07-16 12:51:30,125 - INFO - Optimizer step 1: log_σ²=0.129840, weight=0.878236 2025-07-16 12:51:51,200 - INFO - log_σ² gradient: -0.458268 2025-07-16 12:51:51,270 - INFO - Optimizer 
step 2: log_σ²=0.130241, weight=0.877883 2025-07-16 12:52:11,676 - INFO - log_σ² gradient: -0.454732 2025-07-16 12:52:11,751 - INFO - Optimizer step 3: log_σ²=0.130645, weight=0.877529 2025-07-16 12:52:33,398 - INFO - log_σ² gradient: -0.474517 2025-07-16 12:52:33,472 - INFO - Optimizer step 4: log_σ²=0.131052, weight=0.877173 2025-07-16 12:52:54,683 - INFO - log_σ² gradient: -0.465244 2025-07-16 12:52:54,753 - INFO - Optimizer step 5: log_σ²=0.131461, weight=0.876814 2025-07-16 12:53:15,515 - INFO - log_σ² gradient: -0.461829 2025-07-16 12:53:15,584 - INFO - Optimizer step 6: log_σ²=0.131871, weight=0.876454 2025-07-16 12:53:37,383 - INFO - log_σ² gradient: -0.451288 2025-07-16 12:53:37,455 - INFO - Optimizer step 7: log_σ²=0.132283, weight=0.876093 2025-07-16 12:53:58,145 - INFO - log_σ² gradient: -0.461148 2025-07-16 12:53:58,217 - INFO - Optimizer step 8: log_σ²=0.132696, weight=0.875731 2025-07-16 12:54:19,714 - INFO - log_σ² gradient: -0.461159 2025-07-16 12:54:19,784 - INFO - Optimizer step 9: log_σ²=0.133111, weight=0.875368 2025-07-16 12:54:41,919 - INFO - log_σ² gradient: -0.463805 2025-07-16 12:54:41,994 - INFO - Optimizer step 10: log_σ²=0.133528, weight=0.875003 2025-07-16 12:55:04,128 - INFO - log_σ² gradient: -0.455390 2025-07-16 12:55:04,202 - INFO - Optimizer step 11: log_σ²=0.133946, weight=0.874638 2025-07-16 12:55:25,520 - INFO - log_σ² gradient: -0.462074 2025-07-16 12:55:25,593 - INFO - Optimizer step 12: log_σ²=0.134365, weight=0.874271 2025-07-16 12:55:46,766 - INFO - log_σ² gradient: -0.463344 2025-07-16 12:55:46,839 - INFO - Optimizer step 13: log_σ²=0.134785, weight=0.873903 2025-07-16 12:56:09,192 - INFO - log_σ² gradient: -0.454937 2025-07-16 12:56:09,264 - INFO - Optimizer step 14: log_σ²=0.135207, weight=0.873535 2025-07-16 12:56:30,927 - INFO - log_σ² gradient: -0.465175 2025-07-16 12:56:31,002 - INFO - Optimizer step 15: log_σ²=0.135630, weight=0.873166 2025-07-16 12:56:52,093 - INFO - log_σ² gradient: -0.462178 2025-07-16 
12:56:52,167 - INFO - Optimizer step 16: log_σ²=0.136054, weight=0.872795 2025-07-16 12:57:13,099 - INFO - log_σ² gradient: -0.466989 2025-07-16 12:57:13,177 - INFO - Optimizer step 17: log_σ²=0.136480, weight=0.872424 2025-07-16 12:57:35,555 - INFO - log_σ² gradient: -0.458796 2025-07-16 12:57:35,630 - INFO - Optimizer step 18: log_σ²=0.136907, weight=0.872051 2025-07-16 12:57:56,887 - INFO - log_σ² gradient: -0.462386 2025-07-16 12:57:56,958 - INFO - Optimizer step 19: log_σ²=0.137335, weight=0.871678 2025-07-16 12:58:19,629 - INFO - log_σ² gradient: -0.458063 2025-07-16 12:58:19,700 - INFO - Optimizer step 20: log_σ²=0.137764, weight=0.871305 2025-07-16 12:58:42,687 - INFO - log_σ² gradient: -0.462488 2025-07-16 12:58:42,760 - INFO - Optimizer step 21: log_σ²=0.138194, weight=0.870930 2025-07-16 12:59:04,758 - INFO - log_σ² gradient: -0.451103 2025-07-16 12:59:04,835 - INFO - Optimizer step 22: log_σ²=0.138624, weight=0.870556 2025-07-16 12:59:24,820 - INFO - log_σ² gradient: -0.454444 2025-07-16 12:59:24,894 - INFO - Optimizer step 23: log_σ²=0.139054, weight=0.870181 2025-07-16 12:59:46,905 - INFO - log_σ² gradient: -0.453572 2025-07-16 12:59:46,976 - INFO - Optimizer step 24: log_σ²=0.139485, weight=0.869806 2025-07-16 13:00:08,521 - INFO - log_σ² gradient: -0.460787 2025-07-16 13:00:08,599 - INFO - Optimizer step 25: log_σ²=0.139917, weight=0.869430 2025-07-16 13:00:29,910 - INFO - log_σ² gradient: -0.448541 2025-07-16 13:00:29,982 - INFO - Optimizer step 26: log_σ²=0.140349, weight=0.869055 2025-07-16 13:00:51,996 - INFO - log_σ² gradient: -0.460167 2025-07-16 13:00:52,074 - INFO - Optimizer step 27: log_σ²=0.140782, weight=0.868679 2025-07-16 13:01:12,573 - INFO - log_σ² gradient: -0.438558 2025-07-16 13:01:12,647 - INFO - Optimizer step 28: log_σ²=0.141214, weight=0.868304 2025-07-16 13:01:33,739 - INFO - log_σ² gradient: -0.460164 2025-07-16 13:01:33,807 - INFO - Optimizer step 29: log_σ²=0.141647, weight=0.867927 2025-07-16 13:01:53,860 - INFO - log_σ² 
gradient: -0.455770 2025-07-16 13:01:53,939 - INFO - Optimizer step 30: log_σ²=0.142081, weight=0.867551 2025-07-16 13:02:15,222 - INFO - log_σ² gradient: -0.450884 2025-07-16 13:02:15,294 - INFO - Optimizer step 31: log_σ²=0.142516, weight=0.867174 2025-07-16 13:02:37,180 - INFO - log_σ² gradient: -0.455358 2025-07-16 13:02:37,258 - INFO - Optimizer step 32: log_σ²=0.142951, weight=0.866796 2025-07-16 13:02:58,103 - INFO - log_σ² gradient: -0.457046 2025-07-16 13:02:58,171 - INFO - Optimizer step 33: log_σ²=0.143387, weight=0.866418 2025-07-16 13:03:20,163 - INFO - log_σ² gradient: -0.453649 2025-07-16 13:03:20,235 - INFO - Optimizer step 34: log_σ²=0.143824, weight=0.866040 2025-07-16 13:03:42,675 - INFO - log_σ² gradient: -0.464817 2025-07-16 13:03:42,745 - INFO - Optimizer step 35: log_σ²=0.144263, weight=0.865660 2025-07-16 13:04:03,208 - INFO - log_σ² gradient: -0.454452 2025-07-16 13:04:03,277 - INFO - Optimizer step 36: log_σ²=0.144702, weight=0.865280 2025-07-16 13:04:24,740 - INFO - log_σ² gradient: -0.455238 2025-07-16 13:04:24,812 - INFO - Optimizer step 37: log_σ²=0.145142, weight=0.864899 2025-07-16 13:04:46,496 - INFO - log_σ² gradient: -0.447621 2025-07-16 13:04:46,569 - INFO - Optimizer step 38: log_σ²=0.145582, weight=0.864519 2025-07-16 13:05:08,500 - INFO - log_σ² gradient: -0.457443 2025-07-16 13:05:08,572 - INFO - Optimizer step 39: log_σ²=0.146023, weight=0.864138 2025-07-16 13:05:30,450 - INFO - log_σ² gradient: -0.449827 2025-07-16 13:05:30,531 - INFO - Optimizer step 40: log_σ²=0.146464, weight=0.863756 2025-07-16 13:05:51,634 - INFO - log_σ² gradient: -0.446650 2025-07-16 13:05:51,706 - INFO - Optimizer step 41: log_σ²=0.146906, weight=0.863375 2025-07-16 13:06:12,186 - INFO - log_σ² gradient: -0.450440 2025-07-16 13:06:12,258 - INFO - Optimizer step 42: log_σ²=0.147348, weight=0.862994 2025-07-16 13:06:33,901 - INFO - log_σ² gradient: -0.460837 2025-07-16 13:06:33,973 - INFO - Optimizer step 43: log_σ²=0.147791, weight=0.862611 
2025-07-16 13:06:55,596 - INFO - log_σ² gradient: -0.456518 2025-07-16 13:06:55,667 - INFO - Optimizer step 44: log_σ²=0.148235, weight=0.862228 2025-07-16 13:07:17,966 - INFO - log_σ² gradient: -0.450189 2025-07-16 13:07:18,034 - INFO - Optimizer step 45: log_σ²=0.148680, weight=0.861845 2025-07-16 13:07:38,370 - INFO - log_σ² gradient: -0.446612 2025-07-16 13:07:38,441 - INFO - Optimizer step 46: log_σ²=0.149125, weight=0.861462 2025-07-16 13:08:01,167 - INFO - log_σ² gradient: -0.449803 2025-07-16 13:08:01,245 - INFO - Optimizer step 47: log_σ²=0.149570, weight=0.861078 2025-07-16 13:08:22,667 - INFO - log_σ² gradient: -0.448341 2025-07-16 13:08:22,735 - INFO - Optimizer step 48: log_σ²=0.150015, weight=0.860695 2025-07-16 13:08:32,594 - INFO - log_σ² gradient: -0.208198 2025-07-16 13:08:32,662 - INFO - Optimizer step 49: log_σ²=0.150438, weight=0.860331 2025-07-16 13:08:32,876 - INFO - Epoch 13: Total optimizer steps: 49 2025-07-16 13:11:28,692 - INFO - Validation metrics: 2025-07-16 13:11:28,692 - INFO - Loss: 0.5809 2025-07-16 13:11:28,692 - INFO - BCE Loss: 0.4511 2025-07-16 13:11:28,692 - INFO - Weighted BCE Loss: 0.3881 2025-07-16 13:11:28,692 - INFO - Average similarity: 0.7465 2025-07-16 13:11:28,692 - INFO - Median similarity: 0.7699 2025-07-16 13:11:28,692 - INFO - Clean sample similarity: 0.7465 2025-07-16 13:11:28,692 - INFO - Corrupted sample similarity: 0.4206 2025-07-16 13:11:28,693 - INFO - Similarity gap (clean - corrupt): 0.3259 2025-07-16 13:11:28,922 - INFO - Epoch 13/30 - Train Loss: 0.6206, Val Loss: 0.5809, Val BCE: 0.4511, Val wBCE: 0.3881, Clean Sim: 0.7465, Corrupt Sim: 0.4206, Gap: 0.3259, Time: 1231.80s 2025-07-16 13:11:28,922 - INFO - New best validation loss: 0.5809 2025-07-16 13:11:31,845 - INFO - New best similarity gap: 0.3259 2025-07-16 13:12:09,559 - INFO - log_σ² gradient: -0.444646 2025-07-16 13:12:09,629 - INFO - Optimizer step 1: log_σ²=0.150863, weight=0.859966 2025-07-16 13:12:32,153 - INFO - log_σ² gradient: -0.455043 
2025-07-16 13:12:32,228 - INFO - Optimizer step 2: log_σ²=0.151291, weight=0.859598 2025-07-16 13:12:53,098 - INFO - log_σ² gradient: -0.452570 2025-07-16 13:12:53,168 - INFO - Optimizer step 3: log_σ²=0.151722, weight=0.859227 2025-07-16 13:13:15,328 - INFO - log_σ² gradient: -0.444921 2025-07-16 13:13:15,405 - INFO - Optimizer step 4: log_σ²=0.152155, weight=0.858855 2025-07-16 13:13:37,885 - INFO - log_σ² gradient: -0.454240 2025-07-16 13:13:37,957 - INFO - Optimizer step 5: log_σ²=0.152590, weight=0.858482 2025-07-16 13:13:59,535 - INFO - log_σ² gradient: -0.440871 2025-07-16 13:13:59,606 - INFO - Optimizer step 6: log_σ²=0.153027, weight=0.858107 2025-07-16 13:14:21,755 - INFO - log_σ² gradient: -0.452685 2025-07-16 13:14:21,829 - INFO - Optimizer step 7: log_σ²=0.153466, weight=0.857730 2025-07-16 13:14:43,828 - INFO - log_σ² gradient: -0.459328 2025-07-16 13:14:43,900 - INFO - Optimizer step 8: log_σ²=0.153907, weight=0.857351 2025-07-16 13:15:06,393 - INFO - log_σ² gradient: -0.448847 2025-07-16 13:15:06,471 - INFO - Optimizer step 9: log_σ²=0.154351, weight=0.856972 2025-07-16 13:15:28,942 - INFO - log_σ² gradient: -0.443207 2025-07-16 13:15:29,021 - INFO - Optimizer step 10: log_σ²=0.154795, weight=0.856591 2025-07-16 13:15:50,897 - INFO - log_σ² gradient: -0.452438 2025-07-16 13:15:50,977 - INFO - Optimizer step 11: log_σ²=0.155241, weight=0.856209 2025-07-16 13:16:14,035 - INFO - log_σ² gradient: -0.444342 2025-07-16 13:16:14,112 - INFO - Optimizer step 12: log_σ²=0.155688, weight=0.855826 2025-07-16 13:16:36,046 - INFO - log_σ² gradient: -0.449290 2025-07-16 13:16:36,119 - INFO - Optimizer step 13: log_σ²=0.156136, weight=0.855443 2025-07-16 13:16:59,334 - INFO - log_σ² gradient: -0.445453 2025-07-16 13:16:59,408 - INFO - Optimizer step 14: log_σ²=0.156586, weight=0.855058 2025-07-16 13:17:20,663 - INFO - log_σ² gradient: -0.450200 2025-07-16 13:17:20,737 - INFO - Optimizer step 15: log_σ²=0.157036, weight=0.854673 2025-07-16 13:17:43,637 - INFO - 
log_σ² gradient: -0.444909 2025-07-16 13:17:43,715 - INFO - Optimizer step 16: log_σ²=0.157488, weight=0.854287 2025-07-16 13:18:07,709 - INFO - log_σ² gradient: -0.455395 2025-07-16 13:18:07,783 - INFO - Optimizer step 17: log_σ²=0.157942, weight=0.853900 2025-07-16 13:18:28,502 - INFO - log_σ² gradient: -0.447333 2025-07-16 13:18:28,573 - INFO - Optimizer step 18: log_σ²=0.158396, weight=0.853512 2025-07-16 13:18:50,712 - INFO - log_σ² gradient: -0.444553 2025-07-16 13:18:50,786 - INFO - Optimizer step 19: log_σ²=0.158851, weight=0.853123 2025-07-16 13:19:11,763 - INFO - log_σ² gradient: -0.445911 2025-07-16 13:19:11,840 - INFO - Optimizer step 20: log_σ²=0.159307, weight=0.852734 2025-07-16 13:19:33,979 - INFO - log_σ² gradient: -0.442099 2025-07-16 13:19:34,050 - INFO - Optimizer step 21: log_σ²=0.159764, weight=0.852345 2025-07-16 13:19:55,639 - INFO - log_σ² gradient: -0.457880 2025-07-16 13:19:55,709 - INFO - Optimizer step 22: log_σ²=0.160222, weight=0.851955 2025-07-16 13:20:18,337 - INFO - log_σ² gradient: -0.444626 2025-07-16 13:20:18,414 - INFO - Optimizer step 23: log_σ²=0.160681, weight=0.851564 2025-07-16 13:20:40,180 - INFO - log_σ² gradient: -0.451456 2025-07-16 13:20:40,250 - INFO - Optimizer step 24: log_σ²=0.161141, weight=0.851172 2025-07-16 13:21:02,503 - INFO - log_σ² gradient: -0.439967 2025-07-16 13:21:02,572 - INFO - Optimizer step 25: log_σ²=0.161602, weight=0.850780 2025-07-16 13:21:25,010 - INFO - log_σ² gradient: -0.443880 2025-07-16 13:21:25,083 - INFO - Optimizer step 26: log_σ²=0.162063, weight=0.850388 2025-07-16 13:21:47,065 - INFO - log_σ² gradient: -0.440404 2025-07-16 13:21:47,147 - INFO - Optimizer step 27: log_σ²=0.162524, weight=0.849995 2025-07-16 13:22:08,120 - INFO - log_σ² gradient: -0.448956 2025-07-16 13:22:08,199 - INFO - Optimizer step 28: log_σ²=0.162987, weight=0.849602 2025-07-16 13:22:29,581 - INFO - log_σ² gradient: -0.443789 2025-07-16 13:22:29,653 - INFO - Optimizer step 29: log_σ²=0.163450, weight=0.849209 
2025-07-16 13:22:51,340 - INFO - log_σ² gradient: -0.454027 2025-07-16 13:22:51,408 - INFO - Optimizer step 30: log_σ²=0.163915, weight=0.848814 2025-07-16 13:23:14,944 - INFO - log_σ² gradient: -0.449026 2025-07-16 13:23:15,022 - INFO - Optimizer step 31: log_σ²=0.164380, weight=0.848419 2025-07-16 13:23:37,478 - INFO - log_σ² gradient: -0.442243 2025-07-16 13:23:37,550 - INFO - Optimizer step 32: log_σ²=0.164847, weight=0.848024 2025-07-16 13:23:59,689 - INFO - log_σ² gradient: -0.439798 2025-07-16 13:23:59,761 - INFO - Optimizer step 33: log_σ²=0.165313, weight=0.847628 2025-07-16 13:24:22,161 - INFO - log_σ² gradient: -0.437244 2025-07-16 13:24:22,238 - INFO - Optimizer step 34: log_σ²=0.165779, weight=0.847233 2025-07-16 13:24:44,807 - INFO - log_σ² gradient: -0.448795 2025-07-16 13:24:44,880 - INFO - Optimizer step 35: log_σ²=0.166247, weight=0.846837 2025-07-16 13:25:05,673 - INFO - log_σ² gradient: -0.447918 2025-07-16 13:25:05,744 - INFO - Optimizer step 36: log_σ²=0.166715, weight=0.846441 2025-07-16 13:25:27,177 - INFO - log_σ² gradient: -0.444810 2025-07-16 13:25:27,249 - INFO - Optimizer step 37: log_σ²=0.167185, weight=0.846043 2025-07-16 13:25:48,862 - INFO - log_σ² gradient: -0.442056 2025-07-16 13:25:48,933 - INFO - Optimizer step 38: log_σ²=0.167654, weight=0.845646 2025-07-16 13:26:10,297 - INFO - log_σ² gradient: -0.442371 2025-07-16 13:26:10,367 - INFO - Optimizer step 39: log_σ²=0.168125, weight=0.845249 2025-07-16 13:26:31,008 - INFO - log_σ² gradient: -0.441929 2025-07-16 13:26:31,077 - INFO - Optimizer step 40: log_σ²=0.168595, weight=0.844851 2025-07-16 13:26:51,036 - INFO - log_σ² gradient: -0.449890 2025-07-16 13:26:51,109 - INFO - Optimizer step 41: log_σ²=0.169068, weight=0.844452 2025-07-16 13:27:13,208 - INFO - log_σ² gradient: -0.439518 2025-07-16 13:27:13,281 - INFO - Optimizer step 42: log_σ²=0.169540, weight=0.844053 2025-07-16 13:27:34,005 - INFO - log_σ² gradient: -0.434871 2025-07-16 13:27:34,083 - INFO - Optimizer step 43: 
log_σ²=0.170012, weight=0.843655 2025-07-16 13:27:55,838 - INFO - log_σ² gradient: -0.440489 2025-07-16 13:27:55,911 - INFO - Optimizer step 44: log_σ²=0.170485, weight=0.843256 2025-07-16 13:28:17,401 - INFO - log_σ² gradient: -0.438607 2025-07-16 13:28:17,471 - INFO - Optimizer step 45: log_σ²=0.170958, weight=0.842857 2025-07-16 13:28:38,616 - INFO - log_σ² gradient: -0.448336 2025-07-16 13:28:38,687 - INFO - Optimizer step 46: log_σ²=0.171432, weight=0.842457 2025-07-16 13:29:01,248 - INFO - log_σ² gradient: -0.431296 2025-07-16 13:29:01,323 - INFO - Optimizer step 47: log_σ²=0.171906, weight=0.842058 2025-07-16 13:29:22,400 - INFO - log_σ² gradient: -0.437096 2025-07-16 13:29:22,477 - INFO - Optimizer step 48: log_σ²=0.172381, weight=0.841659 2025-07-16 13:29:32,780 - INFO - log_σ² gradient: -0.202683 2025-07-16 13:29:32,852 - INFO - Optimizer step 49: log_σ²=0.172830, weight=0.841280 2025-07-16 13:29:33,078 - INFO - Epoch 14: Total optimizer steps: 49 2025-07-16 13:32:39,697 - INFO - Validation metrics: 2025-07-16 13:32:39,697 - INFO - Loss: 0.5610 2025-07-16 13:32:39,698 - INFO - BCE Loss: 0.4394 2025-07-16 13:32:39,698 - INFO - Weighted BCE Loss: 0.3697 2025-07-16 13:32:39,698 - INFO - Average similarity: 0.7379 2025-07-16 13:32:39,698 - INFO - Median similarity: 0.7567 2025-07-16 13:32:39,698 - INFO - Clean sample similarity: 0.7379 2025-07-16 13:32:39,698 - INFO - Corrupted sample similarity: 0.4141 2025-07-16 13:32:39,698 - INFO - Similarity gap (clean - corrupt): 0.3238 2025-07-16 13:32:39,903 - INFO - Epoch 14/30 - Train Loss: 0.6013, Val Loss: 0.5610, Val BCE: 0.4394, Val wBCE: 0.3697, Clean Sim: 0.7379, Corrupt Sim: 0.4141, Gap: 0.3238, Time: 1264.37s 2025-07-16 13:32:39,903 - INFO - New best validation loss: 0.5610 2025-07-16 13:35:31,749 - INFO - Epoch 14 Validation Alignment: Pos=0.186, Neg=0.103, Gap=0.082 2025-07-16 13:36:03,275 - INFO - log_σ² gradient: -0.441030 2025-07-16 13:36:03,349 - INFO - Optimizer step 1: log_σ²=0.173283, 
weight=0.840900 2025-07-16 13:36:24,176 - INFO - log_σ² gradient: -0.437414 2025-07-16 13:36:24,244 - INFO - Optimizer step 2: log_σ²=0.173739, weight=0.840517 2025-07-16 13:36:45,237 - INFO - log_σ² gradient: -0.437581 2025-07-16 13:36:45,307 - INFO - Optimizer step 3: log_σ²=0.174197, weight=0.840132 2025-07-16 13:37:07,142 - INFO - log_σ² gradient: -0.435569 2025-07-16 13:37:07,215 - INFO - Optimizer step 4: log_σ²=0.174657, weight=0.839745 2025-07-16 13:37:28,932 - INFO - log_σ² gradient: -0.449754 2025-07-16 13:37:29,014 - INFO - Optimizer step 5: log_σ²=0.175120, weight=0.839356 2025-07-16 13:37:50,988 - INFO - log_σ² gradient: -0.435414 2025-07-16 13:37:51,066 - INFO - Optimizer step 6: log_σ²=0.175586, weight=0.838966 2025-07-16 13:38:12,488 - INFO - log_σ² gradient: -0.438252 2025-07-16 13:38:12,560 - INFO - Optimizer step 7: log_σ²=0.176053, weight=0.838574 2025-07-16 13:38:35,355 - INFO - log_σ² gradient: -0.438660 2025-07-16 13:38:35,426 - INFO - Optimizer step 8: log_σ²=0.176522, weight=0.838180 2025-07-16 13:38:57,547 - INFO - log_σ² gradient: -0.439614 2025-07-16 13:38:57,621 - INFO - Optimizer step 9: log_σ²=0.176993, weight=0.837786 2025-07-16 13:39:19,872 - INFO - log_σ² gradient: -0.440565 2025-07-16 13:39:19,948 - INFO - Optimizer step 10: log_σ²=0.177466, weight=0.837390 2025-07-16 13:39:41,879 - INFO - log_σ² gradient: -0.442490 2025-07-16 13:39:41,951 - INFO - Optimizer step 11: log_σ²=0.177940, weight=0.836992 2025-07-16 13:40:02,912 - INFO - log_σ² gradient: -0.444353 2025-07-16 13:40:02,983 - INFO - Optimizer step 12: log_σ²=0.178417, weight=0.836593 2025-07-16 13:40:25,590 - INFO - log_σ² gradient: -0.440442 2025-07-16 13:40:25,664 - INFO - Optimizer step 13: log_σ²=0.178895, weight=0.836193 2025-07-16 13:40:47,073 - INFO - log_σ² gradient: -0.435963 2025-07-16 13:40:47,146 - INFO - Optimizer step 14: log_σ²=0.179375, weight=0.835793 2025-07-16 13:41:09,290 - INFO - log_σ² gradient: -0.444569 2025-07-16 13:41:09,366 - INFO - Optimizer 
step 15: log_σ²=0.179856, weight=0.835391 2025-07-16 13:41:30,461 - INFO - log_σ² gradient: -0.438586 2025-07-16 13:41:30,531 - INFO - Optimizer step 16: log_σ²=0.180338, weight=0.834988 2025-07-16 13:41:52,934 - INFO - log_σ² gradient: -0.429973 2025-07-16 13:41:53,008 - INFO - Optimizer step 17: log_σ²=0.180820, weight=0.834585 2025-07-16 13:42:14,163 - INFO - log_σ² gradient: -0.437058 2025-07-16 13:42:14,234 - INFO - Optimizer step 18: log_σ²=0.181303, weight=0.834182 2025-07-16 13:42:35,373 - INFO - log_σ² gradient: -0.429871 2025-07-16 13:42:35,450 - INFO - Optimizer step 19: log_σ²=0.181787, weight=0.833779 2025-07-16 13:42:57,504 - INFO - log_σ² gradient: -0.439032 2025-07-16 13:42:57,576 - INFO - Optimizer step 20: log_σ²=0.182272, weight=0.833375 2025-07-16 13:43:18,269 - INFO - log_σ² gradient: -0.434550 2025-07-16 13:43:18,341 - INFO - Optimizer step 21: log_σ²=0.182757, weight=0.832970 2025-07-16 13:43:39,788 - INFO - log_σ² gradient: -0.436610 2025-07-16 13:43:39,859 - INFO - Optimizer step 22: log_σ²=0.183244, weight=0.832565 2025-07-16 13:44:01,713 - INFO - log_σ² gradient: -0.437523 2025-07-16 13:44:01,783 - INFO - Optimizer step 23: log_σ²=0.183731, weight=0.832160 2025-07-16 13:44:22,897 - INFO - log_σ² gradient: -0.439075 2025-07-16 13:44:22,970 - INFO - Optimizer step 24: log_σ²=0.184220, weight=0.831753 2025-07-16 13:44:44,718 - INFO - log_σ² gradient: -0.439991 2025-07-16 13:44:44,785 - INFO - Optimizer step 25: log_σ²=0.184709, weight=0.831346 2025-07-16 13:45:06,814 - INFO - log_σ² gradient: -0.436213 2025-07-16 13:45:06,888 - INFO - Optimizer step 26: log_σ²=0.185200, weight=0.830938 2025-07-16 13:45:28,530 - INFO - log_σ² gradient: -0.432078 2025-07-16 13:45:28,605 - INFO - Optimizer step 27: log_σ²=0.185691, weight=0.830530 2025-07-16 13:45:50,622 - INFO - log_σ² gradient: -0.425228 2025-07-16 13:45:50,700 - INFO - Optimizer step 28: log_σ²=0.186182, weight=0.830123 2025-07-16 13:46:13,109 - INFO - log_σ² gradient: -0.438977 2025-07-16 
13:46:13,183 - INFO - Optimizer step 29: log_σ²=0.186674, weight=0.829715 2025-07-16 13:46:34,393 - INFO - log_σ² gradient: -0.421247 2025-07-16 13:46:34,466 - INFO - Optimizer step 30: log_σ²=0.187165, weight=0.829307 2025-07-16 13:46:56,259 - INFO - log_σ² gradient: -0.435903 2025-07-16 13:46:56,333 - INFO - Optimizer step 31: log_σ²=0.187657, weight=0.828899 2025-07-16 13:47:17,619 - INFO - log_σ² gradient: -0.429825 2025-07-16 13:47:17,691 - INFO - Optimizer step 32: log_σ²=0.188149, weight=0.828491 2025-07-16 13:47:40,888 - INFO - log_σ² gradient: -0.432438 2025-07-16 13:47:40,895 - INFO - Optimizer step 33: log_σ²=0.188642, weight=0.828083 2025-07-16 13:48:00,391 - INFO - log_σ² gradient: -0.435011 2025-07-16 13:48:00,460 - INFO - Optimizer step 34: log_σ²=0.189136, weight=0.827674 2025-07-16 13:48:22,325 - INFO - log_σ² gradient: -0.422942 2025-07-16 13:48:22,397 - INFO - Optimizer step 35: log_σ²=0.189630, weight=0.827265 2025-07-16 13:48:44,018 - INFO - log_σ² gradient: -0.437439 2025-07-16 13:48:44,093 - INFO - Optimizer step 36: log_σ²=0.190125, weight=0.826856 2025-07-16 13:49:05,979 - INFO - log_σ² gradient: -0.429583 2025-07-16 13:49:06,048 - INFO - Optimizer step 37: log_σ²=0.190621, weight=0.826446 2025-07-16 13:49:27,081 - INFO - log_σ² gradient: -0.436973 2025-07-16 13:49:27,154 - INFO - Optimizer step 38: log_σ²=0.191117, weight=0.826036 2025-07-16 13:49:48,806 - INFO - log_σ² gradient: -0.434679 2025-07-16 13:49:48,880 - INFO - Optimizer step 39: log_σ²=0.191615, weight=0.825625 2025-07-16 13:50:10,279 - INFO - log_σ² gradient: -0.428832 2025-07-16 13:50:10,347 - INFO - Optimizer step 40: log_σ²=0.192113, weight=0.825214 2025-07-16 13:50:31,513 - INFO - log_σ² gradient: -0.433851 2025-07-16 13:50:31,587 - INFO - Optimizer step 41: log_σ²=0.192612, weight=0.824802 2025-07-16 13:50:52,298 - INFO - log_σ² gradient: -0.425681 2025-07-16 13:50:52,368 - INFO - Optimizer step 42: log_σ²=0.193111, weight=0.824390 2025-07-16 13:51:13,701 - INFO - log_σ² 
gradient: -0.425777 2025-07-16 13:51:13,774 - INFO - Optimizer step 43: log_σ²=0.193610, weight=0.823979 2025-07-16 13:51:33,890 - INFO - log_σ² gradient: -0.426215 2025-07-16 13:51:33,961 - INFO - Optimizer step 44: log_σ²=0.194109, weight=0.823568 2025-07-16 13:51:55,602 - INFO - log_σ² gradient: -0.430646 2025-07-16 13:51:55,672 - INFO - Optimizer step 45: log_σ²=0.194609, weight=0.823156 2025-07-16 13:52:17,178 - INFO - log_σ² gradient: -0.420826 2025-07-16 13:52:17,250 - INFO - Optimizer step 46: log_σ²=0.195109, weight=0.822745 2025-07-16 13:52:38,762 - INFO - log_σ² gradient: -0.423935 2025-07-16 13:52:38,835 - INFO - Optimizer step 47: log_σ²=0.195608, weight=0.822334 2025-07-16 13:53:00,381 - INFO - log_σ² gradient: -0.433324 2025-07-16 13:53:00,451 - INFO - Optimizer step 48: log_σ²=0.196109, weight=0.821922 2025-07-16 13:53:10,769 - INFO - log_σ² gradient: -0.200581 2025-07-16 13:53:10,844 - INFO - Optimizer step 49: log_σ²=0.196585, weight=0.821532 2025-07-16 13:53:11,070 - INFO - Epoch 15: Total optimizer steps: 49 2025-07-16 13:56:06,394 - INFO - Validation metrics: 2025-07-16 13:56:06,394 - INFO - Loss: 0.5594 2025-07-16 13:56:06,394 - INFO - BCE Loss: 0.4357 2025-07-16 13:56:06,394 - INFO - Weighted BCE Loss: 0.3579 2025-07-16 13:56:06,394 - INFO - Average similarity: 0.7915 2025-07-16 13:56:06,394 - INFO - Median similarity: 0.8195 2025-07-16 13:56:06,394 - INFO - Clean sample similarity: 0.7915 2025-07-16 13:56:06,394 - INFO - Corrupted sample similarity: 0.4512 2025-07-16 13:56:06,394 - INFO - Similarity gap (clean - corrupt): 0.3402 2025-07-16 13:56:06,583 - INFO - Epoch 15/30 - Train Loss: 0.5877, Val Loss: 0.5594, Val BCE: 0.4357, Val wBCE: 0.3579, Clean Sim: 0.7915, Corrupt Sim: 0.4512, Gap: 0.3402, Time: 1234.83s 2025-07-16 13:56:06,583 - INFO - New best validation loss: 0.5594 2025-07-16 13:56:09,952 - INFO - New best similarity gap: 0.3402 2025-07-16 13:56:47,552 - INFO - log_σ² gradient: -0.425235 2025-07-16 13:56:47,621 - INFO - 
Optimizer step 1: log_σ²=0.197063, weight=0.821139 2025-07-16 13:57:09,287 - INFO - log_σ² gradient: -0.433797 2025-07-16 13:57:09,355 - INFO - Optimizer step 2: log_σ²=0.197545, weight=0.820744 2025-07-16 13:57:30,982 - INFO - log_σ² gradient: -0.425572 2025-07-16 13:57:31,054 - INFO - Optimizer step 3: log_σ²=0.198029, weight=0.820346 2025-07-16 13:57:52,108 - INFO - log_σ² gradient: -0.427831 2025-07-16 13:57:52,177 - INFO - Optimizer step 4: log_σ²=0.198516, weight=0.819947 2025-07-16 13:58:13,606 - INFO - log_σ² gradient: -0.438208 2025-07-16 13:58:13,676 - INFO - Optimizer step 5: log_σ²=0.199006, weight=0.819545 2025-07-16 13:58:35,838 - INFO - log_σ² gradient: -0.427882 2025-07-16 13:58:35,910 - INFO - Optimizer step 6: log_σ²=0.199499, weight=0.819141 2025-07-16 13:58:57,178 - INFO - log_σ² gradient: -0.427672 2025-07-16 13:58:57,251 - INFO - Optimizer step 7: log_σ²=0.199994, weight=0.818736 2025-07-16 13:59:19,664 - INFO - log_σ² gradient: -0.426798 2025-07-16 13:59:19,737 - INFO - Optimizer step 8: log_σ²=0.200490, weight=0.818330 2025-07-16 13:59:40,394 - INFO - log_σ² gradient: -0.425209 2025-07-16 13:59:40,465 - INFO - Optimizer step 9: log_σ²=0.200988, weight=0.817922 2025-07-16 14:00:01,435 - INFO - log_σ² gradient: -0.420625 2025-07-16 14:00:01,513 - INFO - Optimizer step 10: log_σ²=0.201486, weight=0.817515 2025-07-16 14:00:24,416 - INFO - log_σ² gradient: -0.426704 2025-07-16 14:00:24,495 - INFO - Optimizer step 11: log_σ²=0.201987, weight=0.817106 2025-07-16 14:00:46,329 - INFO - log_σ² gradient: -0.431499 2025-07-16 14:00:46,403 - INFO - Optimizer step 12: log_σ²=0.202489, weight=0.816695 2025-07-16 14:01:08,310 - INFO - log_σ² gradient: -0.418883 2025-07-16 14:01:08,378 - INFO - Optimizer step 13: log_σ²=0.202992, weight=0.816285 2025-07-16 14:01:30,219 - INFO - log_σ² gradient: -0.427326 2025-07-16 14:01:30,296 - INFO - Optimizer step 14: log_σ²=0.203496, weight=0.815873 2025-07-16 14:01:52,689 - INFO - log_σ² gradient: -0.423943 2025-07-16 
14:01:52,759 - INFO - Optimizer step 15: log_σ²=0.204001, weight=0.815461 2025-07-16 14:02:13,940 - INFO - log_σ² gradient: -0.428296 2025-07-16 14:02:14,007 - INFO - Optimizer step 16: log_σ²=0.204508, weight=0.815048 2025-07-16 14:02:35,356 - INFO - log_σ² gradient: -0.425683 2025-07-16 14:02:35,430 - INFO - Optimizer step 17: log_σ²=0.205016, weight=0.814634 2025-07-16 14:02:56,359 - INFO - log_σ² gradient: -0.426102 2025-07-16 14:02:56,433 - INFO - Optimizer step 18: log_σ²=0.205525, weight=0.814219 2025-07-16 14:03:18,262 - INFO - log_σ² gradient: -0.426451 2025-07-16 14:03:18,339 - INFO - Optimizer step 19: log_σ²=0.206036, weight=0.813804 2025-07-16 14:03:38,002 - INFO - log_σ² gradient: -0.425627 2025-07-16 14:03:38,070 - INFO - Optimizer step 20: log_σ²=0.206547, weight=0.813388 2025-07-16 14:03:59,384 - INFO - log_σ² gradient: -0.425830 2025-07-16 14:03:59,452 - INFO - Optimizer step 21: log_σ²=0.207059, weight=0.812971 2025-07-16 14:04:21,818 - INFO - log_σ² gradient: -0.417225 2025-07-16 14:04:21,896 - INFO - Optimizer step 22: log_σ²=0.207572, weight=0.812555 2025-07-16 14:04:43,639 - INFO - log_σ² gradient: -0.422223 2025-07-16 14:04:43,709 - INFO - Optimizer step 23: log_σ²=0.208085, weight=0.812138 2025-07-16 14:05:05,564 - INFO - log_σ² gradient: -0.426044 2025-07-16 14:05:05,637 - INFO - Optimizer step 24: log_σ²=0.208599, weight=0.811721 2025-07-16 14:05:25,991 - INFO - log_σ² gradient: -0.423950 2025-07-16 14:05:26,070 - INFO - Optimizer step 25: log_σ²=0.209114, weight=0.811303 2025-07-16 14:05:48,100 - INFO - log_σ² gradient: -0.430801 2025-07-16 14:05:48,173 - INFO - Optimizer step 26: log_σ²=0.209630, weight=0.810884 2025-07-16 14:06:09,691 - INFO - log_σ² gradient: -0.419500 2025-07-16 14:06:09,764 - INFO - Optimizer step 27: log_σ²=0.210147, weight=0.810465 2025-07-16 14:06:31,748 - INFO - log_σ² gradient: -0.425974 2025-07-16 14:06:31,825 - INFO - Optimizer step 28: log_σ²=0.210665, weight=0.810045 2025-07-16 14:06:53,215 - INFO - log_σ² 
gradient: -0.423583 2025-07-16 14:06:53,285 - INFO - Optimizer step 29: log_σ²=0.211184, weight=0.809625 2025-07-16 14:07:14,290 - INFO - log_σ² gradient: -0.420103 2025-07-16 14:07:14,362 - INFO - Optimizer step 30: log_σ²=0.211703, weight=0.809205 2025-07-16 14:07:35,903 - INFO - log_σ² gradient: -0.421975 2025-07-16 14:07:35,976 - INFO - Optimizer step 31: log_σ²=0.212223, weight=0.808785 2025-07-16 14:07:57,141 - INFO - log_σ² gradient: -0.419189 2025-07-16 14:07:57,215 - INFO - Optimizer step 32: log_σ²=0.212742, weight=0.808364 2025-07-16 14:08:19,065 - INFO - log_σ² gradient: -0.426154 2025-07-16 14:08:19,135 - INFO - Optimizer step 33: log_σ²=0.213264, weight=0.807943 2025-07-16 14:08:39,847 - INFO - log_σ² gradient: -0.415404 2025-07-16 14:08:39,917 - INFO - Optimizer step 34: log_σ²=0.213784, weight=0.807522 2025-07-16 14:09:01,035 - INFO - log_σ² gradient: -0.415806 2025-07-16 14:09:01,106 - INFO - Optimizer step 35: log_σ²=0.214305, weight=0.807102 2025-07-16 14:09:22,232 - INFO - log_σ² gradient: -0.417034 2025-07-16 14:09:22,303 - INFO - Optimizer step 36: log_σ²=0.214827, weight=0.806681 2025-07-16 14:09:44,750 - INFO - log_σ² gradient: -0.418794 2025-07-16 14:09:44,828 - INFO - Optimizer step 37: log_σ²=0.215348, weight=0.806261 2025-07-16 14:10:04,712 - INFO - log_σ² gradient: -0.423708 2025-07-16 14:10:04,780 - INFO - Optimizer step 38: log_σ²=0.215871, weight=0.805839 2025-07-16 14:10:26,104 - INFO - log_σ² gradient: -0.418362 2025-07-16 14:10:26,176 - INFO - Optimizer step 39: log_σ²=0.216394, weight=0.805418 2025-07-16 14:10:47,580 - INFO - log_σ² gradient: -0.417462 2025-07-16 14:10:47,648 - INFO - Optimizer step 40: log_σ²=0.216918, weight=0.804996 2025-07-16 14:11:10,243 - INFO - log_σ² gradient: -0.411095 2025-07-16 14:11:10,315 - INFO - Optimizer step 41: log_σ²=0.217441, weight=0.804575 2025-07-16 14:11:32,515 - INFO - log_σ² gradient: -0.421542 2025-07-16 14:11:32,594 - INFO - Optimizer step 42: log_σ²=0.217965, weight=0.804154 
2025-07-16 14:11:53,949 - INFO - log_σ² gradient: -0.415138 2025-07-16 14:11:54,017 - INFO - Optimizer step 43: log_σ²=0.218490, weight=0.803732 2025-07-16 14:12:15,867 - INFO - log_σ² gradient: -0.422911 2025-07-16 14:12:15,941 - INFO - Optimizer step 44: log_σ²=0.219015, weight=0.803310 2025-07-16 14:12:38,082 - INFO - log_σ² gradient: -0.411661 2025-07-16 14:12:38,155 - INFO - Optimizer step 45: log_σ²=0.219541, weight=0.802887 2025-07-16 14:12:59,930 - INFO - log_σ² gradient: -0.416472 2025-07-16 14:13:00,009 - INFO - Optimizer step 46: log_σ²=0.220067, weight=0.802465 2025-07-16 14:13:21,021 - INFO - log_σ² gradient: -0.418511 2025-07-16 14:13:21,094 - INFO - Optimizer step 47: log_σ²=0.220593, weight=0.802043 2025-07-16 14:13:44,043 - INFO - log_σ² gradient: -0.419393 2025-07-16 14:13:44,113 - INFO - Optimizer step 48: log_σ²=0.221121, weight=0.801620 2025-07-16 14:13:54,074 - INFO - log_σ² gradient: -0.192322 2025-07-16 14:13:54,147 - INFO - Optimizer step 49: log_σ²=0.221621, weight=0.801219 2025-07-16 14:13:54,398 - INFO - Epoch 16: Total optimizer steps: 49 2025-07-16 14:16:49,277 - INFO - Validation metrics: 2025-07-16 14:16:49,278 - INFO - Loss: 0.5296 2025-07-16 14:16:49,278 - INFO - BCE Loss: 0.4167 2025-07-16 14:16:49,278 - INFO - Weighted BCE Loss: 0.3339 2025-07-16 14:16:49,278 - INFO - Average similarity: 0.7641 2025-07-16 14:16:49,278 - INFO - Median similarity: 0.7976 2025-07-16 14:16:49,278 - INFO - Clean sample similarity: 0.7641 2025-07-16 14:16:49,278 - INFO - Corrupted sample similarity: 0.3992 2025-07-16 14:16:49,278 - INFO - Similarity gap (clean - corrupt): 0.3649 2025-07-16 14:16:49,469 - INFO - Epoch 16/30 - Train Loss: 0.5739, Val Loss: 0.5296, Val BCE: 0.4167, Val wBCE: 0.3339, Clean Sim: 0.7641, Corrupt Sim: 0.3992, Gap: 0.3649, Time: 1233.57s 2025-07-16 14:16:49,469 - INFO - New best validation loss: 0.5296 2025-07-16 14:16:52,123 - INFO - New best similarity gap: 0.3649 2025-07-16 14:19:39,404 - INFO - Epoch 16 Validation 
Alignment: Pos=0.203, Neg=0.109, Gap=0.094 2025-07-16 14:20:10,077 - INFO - log_σ² gradient: -0.419445 2025-07-16 14:20:10,156 - INFO - Optimizer step 1: log_σ²=0.222125, weight=0.800815 2025-07-16 14:20:30,956 - INFO - log_σ² gradient: -0.421323 2025-07-16 14:20:31,026 - INFO - Optimizer step 2: log_σ²=0.222632, weight=0.800409 2025-07-16 14:20:52,365 - INFO - log_σ² gradient: -0.419284 2025-07-16 14:20:52,438 - INFO - Optimizer step 3: log_σ²=0.223143, weight=0.800000 2025-07-16 14:21:13,399 - INFO - log_σ² gradient: -0.420456 2025-07-16 14:21:13,473 - INFO - Optimizer step 4: log_σ²=0.223657, weight=0.799590 2025-07-16 14:21:35,290 - INFO - log_σ² gradient: -0.411817 2025-07-16 14:21:35,367 - INFO - Optimizer step 5: log_σ²=0.224172, weight=0.799178 2025-07-16 14:21:55,524 - INFO - log_σ² gradient: -0.418720 2025-07-16 14:21:55,590 - INFO - Optimizer step 6: log_σ²=0.224690, weight=0.798764 2025-07-16 14:22:17,795 - INFO - log_σ² gradient: -0.415291 2025-07-16 14:22:17,865 - INFO - Optimizer step 7: log_σ²=0.225210, weight=0.798349 2025-07-16 14:22:38,610 - INFO - log_σ² gradient: -0.413164 2025-07-16 14:22:38,678 - INFO - Optimizer step 8: log_σ²=0.225731, weight=0.797933 2025-07-16 14:23:00,870 - INFO - log_σ² gradient: -0.412953 2025-07-16 14:23:00,944 - INFO - Optimizer step 9: log_σ²=0.226254, weight=0.797516 2025-07-16 14:23:23,277 - INFO - log_σ² gradient: -0.408302 2025-07-16 14:23:23,346 - INFO - Optimizer step 10: log_σ²=0.226777, weight=0.797098 2025-07-16 14:23:45,122 - INFO - log_σ² gradient: -0.417960 2025-07-16 14:23:45,192 - INFO - Optimizer step 11: log_σ²=0.227303, weight=0.796679 2025-07-16 14:24:06,503 - INFO - log_σ² gradient: -0.407416 2025-07-16 14:24:06,581 - INFO - Optimizer step 12: log_σ²=0.227829, weight=0.796260 2025-07-16 14:24:28,808 - INFO - log_σ² gradient: -0.418163 2025-07-16 14:24:28,877 - INFO - Optimizer step 13: log_σ²=0.228357, weight=0.795840 2025-07-16 14:24:50,600 - INFO - log_σ² gradient: -0.410109 2025-07-16 
14:24:50,673 - INFO - Optimizer step 14: log_σ²=0.228886, weight=0.795419 2025-07-16 14:25:12,103 - INFO - log_σ² gradient: -0.413720 2025-07-16 14:25:12,182 - INFO - Optimizer step 15: log_σ²=0.229416, weight=0.794998 2025-07-16 14:25:33,086 - INFO - log_σ² gradient: -0.417282 2025-07-16 14:25:33,173 - INFO - Optimizer step 16: log_σ²=0.229948, weight=0.794575 2025-07-16 14:25:54,057 - INFO - log_σ² gradient: -0.413599 2025-07-16 14:25:54,127 - INFO - Optimizer step 17: log_σ²=0.230481, weight=0.794151 2025-07-16 14:26:15,123 - INFO - log_σ² gradient: -0.415801 2025-07-16 14:26:15,192 - INFO - Optimizer step 18: log_σ²=0.231016, weight=0.793727 2025-07-16 14:26:37,025 - INFO - log_σ² gradient: -0.410668 2025-07-16 14:26:37,099 - INFO - Optimizer step 19: log_σ²=0.231551, weight=0.793303 2025-07-16 14:26:58,695 - INFO - log_σ² gradient: -0.415145 2025-07-16 14:26:58,766 - INFO - Optimizer step 20: log_σ²=0.232087, weight=0.792877 2025-07-16 14:27:21,048 - INFO - log_σ² gradient: -0.414649 2025-07-16 14:27:21,120 - INFO - Optimizer step 21: log_σ²=0.232624, weight=0.792451 2025-07-16 14:27:41,474 - INFO - log_σ² gradient: -0.404341 2025-07-16 14:27:41,547 - INFO - Optimizer step 22: log_σ²=0.233162, weight=0.792026 2025-07-16 14:28:04,273 - INFO - log_σ² gradient: -0.408067 2025-07-16 14:28:04,355 - INFO - Optimizer step 23: log_σ²=0.233699, weight=0.791600 2025-07-16 14:28:25,423 - INFO - log_σ² gradient: -0.407313 2025-07-16 14:28:25,493 - INFO - Optimizer step 24: log_σ²=0.234237, weight=0.791174 2025-07-16 14:28:45,857 - INFO - log_σ² gradient: -0.406918 2025-07-16 14:28:45,927 - INFO - Optimizer step 25: log_σ²=0.234775, weight=0.790749 2025-07-16 14:29:06,622 - INFO - log_σ² gradient: -0.410862 2025-07-16 14:29:06,693 - INFO - Optimizer step 26: log_σ²=0.235314, weight=0.790322 2025-07-16 14:29:27,436 - INFO - log_σ² gradient: -0.414455 2025-07-16 14:29:27,503 - INFO - Optimizer step 27: log_σ²=0.235855, weight=0.789895 2025-07-16 14:29:48,613 - INFO - log_σ² 
gradient: -0.405936 2025-07-16 14:29:48,688 - INFO - Optimizer step 28: log_σ²=0.236395, weight=0.789469 2025-07-16 14:30:11,752 - INFO - log_σ² gradient: -0.403278 2025-07-16 14:30:11,829 - INFO - Optimizer step 29: log_σ²=0.236936, weight=0.789042 2025-07-16 14:30:33,376 - INFO - log_σ² gradient: -0.415837 2025-07-16 14:30:33,448 - INFO - Optimizer step 30: log_σ²=0.237478, weight=0.788614 2025-07-16 14:30:55,867 - INFO - log_σ² gradient: -0.417097 2025-07-16 14:30:55,940 - INFO - Optimizer step 31: log_σ²=0.238022, weight=0.788186 2025-07-16 14:31:17,863 - INFO - log_σ² gradient: -0.409128 2025-07-16 14:31:17,942 - INFO - Optimizer step 32: log_σ²=0.238566, weight=0.787757 2025-07-16 14:31:39,864 - INFO - log_σ² gradient: -0.401862 2025-07-16 14:31:39,934 - INFO - Optimizer step 33: log_σ²=0.239110, weight=0.787328 2025-07-16 14:32:01,726 - INFO - log_σ² gradient: -0.402233 2025-07-16 14:32:01,800 - INFO - Optimizer step 34: log_σ²=0.239654, weight=0.786900 2025-07-16 14:32:23,759 - INFO - log_σ² gradient: -0.403380 2025-07-16 14:32:23,829 - INFO - Optimizer step 35: log_σ²=0.240198, weight=0.786472 2025-07-16 14:32:45,472 - INFO - log_σ² gradient: -0.407047 2025-07-16 14:32:45,546 - INFO - Optimizer step 36: log_σ²=0.240742, weight=0.786044 2025-07-16 14:33:06,845 - INFO - log_σ² gradient: -0.404792 2025-07-16 14:33:06,917 - INFO - Optimizer step 37: log_σ²=0.241287, weight=0.785616 2025-07-16 14:33:28,084 - INFO - log_σ² gradient: -0.410777 2025-07-16 14:33:28,156 - INFO - Optimizer step 38: log_σ²=0.241833, weight=0.785187 2025-07-16 14:33:50,302 - INFO - log_σ² gradient: -0.410787 2025-07-16 14:33:50,375 - INFO - Optimizer step 39: log_σ²=0.242380, weight=0.784758 2025-07-16 14:34:11,329 - INFO - log_σ² gradient: -0.412377 2025-07-16 14:34:11,396 - INFO - Optimizer step 40: log_σ²=0.242928, weight=0.784328 2025-07-16 14:34:32,529 - INFO - log_σ² gradient: -0.409479 2025-07-16 14:34:32,599 - INFO - Optimizer step 41: log_σ²=0.243478, weight=0.783897 
2025-07-16 14:34:53,667 - INFO - log_σ² gradient: -0.409367 2025-07-16 14:34:53,734 - INFO - Optimizer step 42: log_σ²=0.244028, weight=0.783466 2025-07-16 14:35:15,265 - INFO - log_σ² gradient: -0.405604 2025-07-16 14:35:15,340 - INFO - Optimizer step 43: log_σ²=0.244578, weight=0.783035 2025-07-16 14:35:37,717 - INFO - log_σ² gradient: -0.408177 2025-07-16 14:35:37,795 - INFO - Optimizer step 44: log_σ²=0.245130, weight=0.782603 2025-07-16 14:36:00,228 - INFO - log_σ² gradient: -0.406198 2025-07-16 14:36:00,314 - INFO - Optimizer step 45: log_σ²=0.245681, weight=0.782171 2025-07-16 14:36:21,269 - INFO - log_σ² gradient: -0.410994 2025-07-16 14:36:21,338 - INFO - Optimizer step 46: log_σ²=0.246234, weight=0.781739 2025-07-16 14:36:41,388 - INFO - log_σ² gradient: -0.410289 2025-07-16 14:36:41,463 - INFO - Optimizer step 47: log_σ²=0.246788, weight=0.781306 2025-07-16 14:37:03,826 - INFO - log_σ² gradient: -0.405704 2025-07-16 14:37:03,904 - INFO - Optimizer step 48: log_σ²=0.247343, weight=0.780873 2025-07-16 14:37:14,005 - INFO - log_σ² gradient: -0.190466 2025-07-16 14:37:14,083 - INFO - Optimizer step 49: log_σ²=0.247868, weight=0.780463 2025-07-16 14:37:14,281 - INFO - Epoch 17: Total optimizer steps: 49 2025-07-16 14:40:09,682 - INFO - Validation metrics: 2025-07-16 14:40:09,683 - INFO - Loss: 0.5119 2025-07-16 14:40:09,683 - INFO - BCE Loss: 0.4040 2025-07-16 14:40:09,683 - INFO - Weighted BCE Loss: 0.3153 2025-07-16 14:40:09,683 - INFO - Average similarity: 0.7292 2025-07-16 14:40:09,683 - INFO - Median similarity: 0.7544 2025-07-16 14:40:09,683 - INFO - Clean sample similarity: 0.7292 2025-07-16 14:40:09,683 - INFO - Corrupted sample similarity: 0.3582 2025-07-16 14:40:09,683 - INFO - Similarity gap (clean - corrupt): 0.3710 2025-07-16 14:40:09,888 - INFO - Epoch 17/30 - Train Loss: 0.5506, Val Loss: 0.5119, Val BCE: 0.4040, Val wBCE: 0.3153, Clean Sim: 0.7292, Corrupt Sim: 0.3582, Gap: 0.3710, Time: 1230.48s 2025-07-16 14:40:09,888 - INFO - New best 
validation loss: 0.5119 2025-07-16 14:40:12,551 - INFO - New best similarity gap: 0.3710 2025-07-16 14:40:47,177 - INFO - log_σ² gradient: -0.407023 2025-07-16 14:40:47,245 - INFO - Optimizer step 1: log_σ²=0.248398, weight=0.780050 2025-07-16 14:41:08,386 - INFO - log_σ² gradient: -0.406007 2025-07-16 14:41:08,461 - INFO - Optimizer step 2: log_σ²=0.248930, weight=0.779635 2025-07-16 14:41:29,885 - INFO - log_σ² gradient: -0.402324 2025-07-16 14:41:29,963 - INFO - Optimizer step 3: log_σ²=0.249465, weight=0.779218 2025-07-16 14:41:51,706 - INFO - log_σ² gradient: -0.406301 2025-07-16 14:41:51,777 - INFO - Optimizer step 4: log_σ²=0.250002, weight=0.778799 2025-07-16 14:42:13,128 - INFO - log_σ² gradient: -0.413245 2025-07-16 14:42:13,199 - INFO - Optimizer step 5: log_σ²=0.250543, weight=0.778378 2025-07-16 14:42:32,836 - INFO - log_σ² gradient: -0.401389 2025-07-16 14:42:32,907 - INFO - Optimizer step 6: log_σ²=0.251086, weight=0.777955 2025-07-16 14:42:54,693 - INFO - log_σ² gradient: -0.400404 2025-07-16 14:42:54,764 - INFO - Optimizer step 7: log_σ²=0.251630, weight=0.777532 2025-07-16 14:43:16,250 - INFO - log_σ² gradient: -0.409081 2025-07-16 14:43:16,325 - INFO - Optimizer step 8: log_σ²=0.252177, weight=0.777107 2025-07-16 14:43:37,838 - INFO - log_σ² gradient: -0.402561 2025-07-16 14:43:37,918 - INFO - Optimizer step 9: log_σ²=0.252726, weight=0.776681 2025-07-16 14:43:59,122 - INFO - log_σ² gradient: -0.403130 2025-07-16 14:43:59,195 - INFO - Optimizer step 10: log_σ²=0.253275, weight=0.776254 2025-07-16 14:44:19,867 - INFO - log_σ² gradient: -0.398079 2025-07-16 14:44:19,939 - INFO - Optimizer step 11: log_σ²=0.253826, weight=0.775827 2025-07-16 14:44:41,102 - INFO - log_σ² gradient: -0.403926 2025-07-16 14:44:41,182 - INFO - Optimizer step 12: log_σ²=0.254378, weight=0.775399 2025-07-16 14:45:02,231 - INFO - log_σ² gradient: -0.395627 2025-07-16 14:45:02,301 - INFO - Optimizer step 13: log_σ²=0.254930, weight=0.774970 2025-07-16 14:45:22,178 - INFO - 
log_σ² gradient: -0.411258 2025-07-16 14:45:22,255 - INFO - Optimizer step 14: log_σ²=0.255485, weight=0.774540 2025-07-16 14:45:43,308 - INFO - log_σ² gradient: -0.411151 2025-07-16 14:45:43,391 - INFO - Optimizer step 15: log_σ²=0.256043, weight=0.774109 2025-07-16 14:46:04,661 - INFO - log_σ² gradient: -0.392449 2025-07-16 14:46:04,732 - INFO - Optimizer step 16: log_σ²=0.256600, weight=0.773678 2025-07-16 14:46:25,363 - INFO - log_σ² gradient: -0.395494 2025-07-16 14:46:25,443 - INFO - Optimizer step 17: log_σ²=0.257157, weight=0.773247 2025-07-16 14:46:47,755 - INFO - log_σ² gradient: -0.409800 2025-07-16 14:46:47,832 - INFO - Optimizer step 18: log_σ²=0.257716, weight=0.772815 2025-07-16 14:47:09,043 - INFO - log_σ² gradient: -0.395376 2025-07-16 14:47:09,113 - INFO - Optimizer step 19: log_σ²=0.258275, weight=0.772382 2025-07-16 14:47:30,377 - INFO - log_σ² gradient: -0.399075 2025-07-16 14:47:30,450 - INFO - Optimizer step 20: log_σ²=0.258835, weight=0.771950 2025-07-16 14:47:51,574 - INFO - log_σ² gradient: -0.390103 2025-07-16 14:47:51,647 - INFO - Optimizer step 21: log_σ²=0.259395, weight=0.771518 2025-07-16 14:48:13,559 - INFO - log_σ² gradient: -0.401222 2025-07-16 14:48:13,632 - INFO - Optimizer step 22: log_σ²=0.259956, weight=0.771086 2025-07-16 14:48:35,326 - INFO - log_σ² gradient: -0.400629 2025-07-16 14:48:35,404 - INFO - Optimizer step 23: log_σ²=0.260517, weight=0.770653 2025-07-16 14:48:57,326 - INFO - log_σ² gradient: -0.400779 2025-07-16 14:48:57,396 - INFO - Optimizer step 24: log_σ²=0.261080, weight=0.770220 2025-07-16 14:49:19,330 - INFO - log_σ² gradient: -0.399401 2025-07-16 14:49:19,408 - INFO - Optimizer step 25: log_σ²=0.261643, weight=0.769786 2025-07-16 14:49:41,515 - INFO - log_σ² gradient: -0.396396 2025-07-16 14:49:41,592 - INFO - Optimizer step 26: log_σ²=0.262207, weight=0.769352 2025-07-16 14:50:02,673 - INFO - log_σ² gradient: -0.396051 2025-07-16 14:50:02,741 - INFO - Optimizer step 27: log_σ²=0.262771, weight=0.768918 
2025-07-16 14:50:25,407 - INFO - log_σ² gradient: -0.395828 2025-07-16 14:50:25,477 - INFO - Optimizer step 28: log_σ²=0.263335, weight=0.768484 2025-07-16 14:50:46,549 - INFO - log_σ² gradient: -0.396303 2025-07-16 14:50:46,623 - INFO - Optimizer step 29: log_σ²=0.263900, weight=0.768050 2025-07-16 14:51:06,974 - INFO - log_σ² gradient: -0.394285 2025-07-16 14:51:07,044 - INFO - Optimizer step 30: log_σ²=0.264465, weight=0.767616 2025-07-16 14:51:29,132 - INFO - log_σ² gradient: -0.395840 2025-07-16 14:51:29,209 - INFO - Optimizer step 31: log_σ²=0.265031, weight=0.767182 2025-07-16 14:51:50,107 - INFO - log_σ² gradient: -0.404813 2025-07-16 14:51:50,185 - INFO - Optimizer step 32: log_σ²=0.265599, weight=0.766747 2025-07-16 14:52:11,731 - INFO - log_σ² gradient: -0.400179 2025-07-16 14:52:11,804 - INFO - Optimizer step 33: log_σ²=0.266167, weight=0.766311 2025-07-16 14:52:33,396 - INFO - log_σ² gradient: -0.396263 2025-07-16 14:52:33,468 - INFO - Optimizer step 34: log_σ²=0.266736, weight=0.765875 2025-07-16 14:52:55,117 - INFO - log_σ² gradient: -0.394935 2025-07-16 14:52:55,189 - INFO - Optimizer step 35: log_σ²=0.267305, weight=0.765439 2025-07-16 14:53:16,692 - INFO - log_σ² gradient: -0.400253 2025-07-16 14:53:16,770 - INFO - Optimizer step 36: log_σ²=0.267876, weight=0.765003 2025-07-16 14:53:39,402 - INFO - log_σ² gradient: -0.390091 2025-07-16 14:53:39,475 - INFO - Optimizer step 37: log_σ²=0.268446, weight=0.764567 2025-07-16 14:54:00,618 - INFO - log_σ² gradient: -0.405599 2025-07-16 14:54:00,690 - INFO - Optimizer step 38: log_σ²=0.269018, weight=0.764129 2025-07-16 14:54:21,433 - INFO - log_σ² gradient: -0.396703 2025-07-16 14:54:21,512 - INFO - Optimizer step 39: log_σ²=0.269591, weight=0.763692 2025-07-16 14:54:41,874 - INFO - log_σ² gradient: -0.402377 2025-07-16 14:54:41,947 - INFO - Optimizer step 40: log_σ²=0.270165, weight=0.763254 2025-07-16 14:55:03,144 - INFO - log_σ² gradient: -0.394951 2025-07-16 14:55:03,214 - INFO - Optimizer step 41: 
log_σ²=0.270739, weight=0.762815 2025-07-16 14:55:25,720 - INFO - log_σ² gradient: -0.400806 2025-07-16 14:55:25,800 - INFO - Optimizer step 42: log_σ²=0.271315, weight=0.762376 2025-07-16 14:55:47,072 - INFO - log_σ² gradient: -0.395776 2025-07-16 14:55:47,150 - INFO - Optimizer step 43: log_σ²=0.271891, weight=0.761937 2025-07-16 14:56:06,103 - INFO - log_σ² gradient: -0.391584 2025-07-16 14:56:06,176 - INFO - Optimizer step 44: log_σ²=0.272467, weight=0.761499 2025-07-16 14:56:27,517 - INFO - log_σ² gradient: -0.403189 2025-07-16 14:56:27,585 - INFO - Optimizer step 45: log_σ²=0.273044, weight=0.761059 2025-07-16 14:56:48,495 - INFO - log_σ² gradient: -0.396396 2025-07-16 14:56:48,566 - INFO - Optimizer step 46: log_σ²=0.273622, weight=0.760619 2025-07-16 14:57:10,640 - INFO - log_σ² gradient: -0.391835 2025-07-16 14:57:10,720 - INFO - Optimizer step 47: log_σ²=0.274200, weight=0.760180 2025-07-16 14:57:31,167 - INFO - log_σ² gradient: -0.395192 2025-07-16 14:57:31,242 - INFO - Optimizer step 48: log_σ²=0.274779, weight=0.759740 2025-07-16 14:57:40,603 - INFO - log_σ² gradient: -0.183835 2025-07-16 14:57:40,677 - INFO - Optimizer step 49: log_σ²=0.275327, weight=0.759324 2025-07-16 14:57:40,878 - INFO - Epoch 18: Total optimizer steps: 49 2025-07-16 15:00:35,864 - INFO - Validation metrics: 2025-07-16 15:00:35,864 - INFO - Loss: 0.4956 2025-07-16 15:00:35,864 - INFO - BCE Loss: 0.3932 2025-07-16 15:00:35,864 - INFO - Weighted BCE Loss: 0.2985 2025-07-16 15:00:35,864 - INFO - Average similarity: 0.7238 2025-07-16 15:00:35,864 - INFO - Median similarity: 0.7532 2025-07-16 15:00:35,864 - INFO - Clean sample similarity: 0.7238 2025-07-16 15:00:35,864 - INFO - Corrupted sample similarity: 0.3340 2025-07-16 15:00:35,864 - INFO - Similarity gap (clean - corrupt): 0.3898 2025-07-16 15:00:36,062 - INFO - Epoch 18/30 - Train Loss: 0.5369, Val Loss: 0.4956, Val BCE: 0.3932, Val wBCE: 0.2985, Clean Sim: 0.7238, Corrupt Sim: 0.3340, Gap: 0.3898, Time: 1220.11s 2025-07-16 
15:00:36,062 - INFO - New best validation loss: 0.4956 2025-07-16 15:00:39,415 - INFO - New best similarity gap: 0.3898 2025-07-16 15:03:26,220 - INFO - Epoch 18 Validation Alignment: Pos=0.186, Neg=0.094, Gap=0.092 2025-07-16 15:03:59,071 - INFO - log_σ² gradient: -0.392702 2025-07-16 15:03:59,143 - INFO - Optimizer step 1: log_σ²=0.275879, weight=0.758905 2025-07-16 15:04:22,137 - INFO - log_σ² gradient: -0.391523 2025-07-16 15:04:22,215 - INFO - Optimizer step 2: log_σ²=0.276433, weight=0.758484 2025-07-16 15:04:43,355 - INFO - log_σ² gradient: -0.388936 2025-07-16 15:04:43,428 - INFO - Optimizer step 3: log_σ²=0.276990, weight=0.758062 2025-07-16 15:05:05,155 - INFO - log_σ² gradient: -0.385084 2025-07-16 15:05:05,233 - INFO - Optimizer step 4: log_σ²=0.277548, weight=0.757639 2025-07-16 15:05:26,676 - INFO - log_σ² gradient: -0.388416 2025-07-16 15:05:26,744 - INFO - Optimizer step 5: log_σ²=0.278109, weight=0.757215 2025-07-16 15:05:48,932 - INFO - log_σ² gradient: -0.383623 2025-07-16 15:05:49,012 - INFO - Optimizer step 6: log_σ²=0.278670, weight=0.756790 2025-07-16 15:06:08,957 - INFO - log_σ² gradient: -0.390997 2025-07-16 15:06:09,031 - INFO - Optimizer step 7: log_σ²=0.279234, weight=0.756363 2025-07-16 15:06:30,098 - INFO - log_σ² gradient: -0.394506 2025-07-16 15:06:30,167 - INFO - Optimizer step 8: log_σ²=0.279800, weight=0.755935 2025-07-16 15:06:51,644 - INFO - log_σ² gradient: -0.392829 2025-07-16 15:06:51,715 - INFO - Optimizer step 9: log_σ²=0.280368, weight=0.755505 2025-07-16 15:07:13,482 - INFO - log_σ² gradient: -0.392308 2025-07-16 15:07:13,552 - INFO - Optimizer step 10: log_σ²=0.280939, weight=0.755075 2025-07-16 15:07:34,666 - INFO - log_σ² gradient: -0.389986 2025-07-16 15:07:34,739 - INFO - Optimizer step 11: log_σ²=0.281511, weight=0.754643 2025-07-16 15:07:56,630 - INFO - log_σ² gradient: -0.389816 2025-07-16 15:07:56,699 - INFO - Optimizer step 12: log_σ²=0.282084, weight=0.754210 2025-07-16 15:08:19,141 - INFO - log_σ² gradient: 
-0.389856 2025-07-16 15:08:19,227 - INFO - Optimizer step 13: log_σ²=0.282658, weight=0.753777 2025-07-16 15:08:41,147 - INFO - log_σ² gradient: -0.383052 2025-07-16 15:08:41,233 - INFO - Optimizer step 14: log_σ²=0.283233, weight=0.753344 2025-07-16 15:09:02,797 - INFO - log_σ² gradient: -0.387525 2025-07-16 15:09:02,870 - INFO - Optimizer step 15: log_σ²=0.283809, weight=0.752910 2025-07-16 15:09:24,231 - INFO - log_σ² gradient: -0.389631 2025-07-16 15:09:24,303 - INFO - Optimizer step 16: log_σ²=0.284386, weight=0.752476 2025-07-16 15:09:45,206 - INFO - log_σ² gradient: -0.386718 2025-07-16 15:09:45,276 - INFO - Optimizer step 17: log_σ²=0.284964, weight=0.752041 2025-07-16 15:10:06,925 - INFO - log_σ² gradient: -0.382677 2025-07-16 15:10:06,994 - INFO - Optimizer step 18: log_σ²=0.285543, weight=0.751606 2025-07-16 15:10:28,043 - INFO - log_σ² gradient: -0.390111 2025-07-16 15:10:28,112 - INFO - Optimizer step 19: log_σ²=0.286122, weight=0.751171 2025-07-16 15:10:49,549 - INFO - log_σ² gradient: -0.386293 2025-07-16 15:10:49,632 - INFO - Optimizer step 20: log_σ²=0.286703, weight=0.750735 2025-07-16 15:11:10,015 - INFO - log_σ² gradient: -0.394482 2025-07-16 15:11:10,088 - INFO - Optimizer step 21: log_σ²=0.287285, weight=0.750298 2025-07-16 15:11:32,803 - INFO - log_σ² gradient: -0.394727 2025-07-16 15:11:32,876 - INFO - Optimizer step 22: log_σ²=0.287869, weight=0.749860 2025-07-16 15:11:53,814 - INFO - log_σ² gradient: -0.386553 2025-07-16 15:11:53,892 - INFO - Optimizer step 23: log_σ²=0.288454, weight=0.749421 2025-07-16 15:12:15,013 - INFO - log_σ² gradient: -0.382538 2025-07-16 15:12:15,083 - INFO - Optimizer step 24: log_σ²=0.289039, weight=0.748983 2025-07-16 15:12:36,506 - INFO - log_σ² gradient: -0.388951 2025-07-16 15:12:36,579 - INFO - Optimizer step 25: log_σ²=0.289625, weight=0.748545 2025-07-16 15:12:58,177 - INFO - log_σ² gradient: -0.386150 2025-07-16 15:12:58,254 - INFO - Optimizer step 26: log_σ²=0.290211, weight=0.748106 2025-07-16 
15:13:20,236 - INFO - log_σ² gradient: -0.386759 2025-07-16 15:13:20,310 - INFO - Optimizer step 27: log_σ²=0.290798, weight=0.747667 2025-07-16 15:13:42,232 - INFO - log_σ² gradient: -0.382922 2025-07-16 15:13:42,309 - INFO - Optimizer step 28: log_σ²=0.291385, weight=0.747228 2025-07-16 15:14:03,526 - INFO - log_σ² gradient: -0.383719 2025-07-16 15:14:03,599 - INFO - Optimizer step 29: log_σ²=0.291973, weight=0.746789 2025-07-16 15:14:26,747 - INFO - log_σ² gradient: -0.382611 2025-07-16 15:14:26,820 - INFO - Optimizer step 30: log_σ²=0.292561, weight=0.746350 2025-07-16 15:14:47,576 - INFO - log_σ² gradient: -0.387137 2025-07-16 15:14:47,651 - INFO - Optimizer step 31: log_σ²=0.293150, weight=0.745910 2025-07-16 15:15:09,928 - INFO - log_σ² gradient: -0.382994 2025-07-16 15:15:10,003 - INFO - Optimizer step 32: log_σ²=0.293739, weight=0.745471 2025-07-16 15:15:32,163 - INFO - log_σ² gradient: -0.383364 2025-07-16 15:15:32,237 - INFO - Optimizer step 33: log_σ²=0.294328, weight=0.745032 2025-07-16 15:15:53,941 - INFO - log_σ² gradient: -0.383505 2025-07-16 15:15:54,011 - INFO - Optimizer step 34: log_σ²=0.294918, weight=0.744592 2025-07-16 15:16:16,045 - INFO - log_σ² gradient: -0.385393 2025-07-16 15:16:16,122 - INFO - Optimizer step 35: log_σ²=0.295509, weight=0.744153 2025-07-16 15:16:36,213 - INFO - log_σ² gradient: -0.380928 2025-07-16 15:16:36,281 - INFO - Optimizer step 36: log_σ²=0.296100, weight=0.743713 2025-07-16 15:16:57,037 - INFO - log_σ² gradient: -0.384225 2025-07-16 15:16:57,116 - INFO - Optimizer step 37: log_σ²=0.296692, weight=0.743273 2025-07-16 15:17:18,879 - INFO - log_σ² gradient: -0.383455 2025-07-16 15:17:18,949 - INFO - Optimizer step 38: log_σ²=0.297284, weight=0.742833 2025-07-16 15:17:40,171 - INFO - log_σ² gradient: -0.385913 2025-07-16 15:17:40,247 - INFO - Optimizer step 39: log_σ²=0.297877, weight=0.742393 2025-07-16 15:18:02,199 - INFO - log_σ² gradient: -0.378584 2025-07-16 15:18:02,271 - INFO - Optimizer step 40: 
log_σ²=0.298470, weight=0.741953 2025-07-16 15:18:24,643 - INFO - log_σ² gradient: -0.387664 2025-07-16 15:18:24,717 - INFO - Optimizer step 41: log_σ²=0.299064, weight=0.741512 2025-07-16 15:18:45,029 - INFO - log_σ² gradient: -0.393240 2025-07-16 15:18:45,104 - INFO - Optimizer step 42: log_σ²=0.299661, weight=0.741070 2025-07-16 15:19:05,739 - INFO - log_σ² gradient: -0.375886 2025-07-16 15:19:05,812 - INFO - Optimizer step 43: log_σ²=0.300256, weight=0.740628 2025-07-16 15:19:27,588 - INFO - log_σ² gradient: -0.381500 2025-07-16 15:19:27,667 - INFO - Optimizer step 44: log_σ²=0.300853, weight=0.740187 2025-07-16 15:19:49,066 - INFO - log_σ² gradient: -0.388855 2025-07-16 15:19:49,137 - INFO - Optimizer step 45: log_σ²=0.301450, weight=0.739745 2025-07-16 15:20:09,457 - INFO - log_σ² gradient: -0.389045 2025-07-16 15:20:09,525 - INFO - Optimizer step 46: log_σ²=0.302049, weight=0.739302 2025-07-16 15:20:30,807 - INFO - log_σ² gradient: -0.379922 2025-07-16 15:20:30,880 - INFO - Optimizer step 47: log_σ²=0.302649, weight=0.738859 2025-07-16 15:20:52,888 - INFO - log_σ² gradient: -0.380179 2025-07-16 15:20:52,958 - INFO - Optimizer step 48: log_σ²=0.303248, weight=0.738416 2025-07-16 15:21:02,534 - INFO - log_σ² gradient: -0.182462 2025-07-16 15:21:02,613 - INFO - Optimizer step 49: log_σ²=0.303817, weight=0.737996 2025-07-16 15:21:02,797 - INFO - Epoch 19: Total optimizer steps: 49 2025-07-16 15:23:59,069 - INFO - Validation metrics: 2025-07-16 15:23:59,070 - INFO - Loss: 0.4808 2025-07-16 15:23:59,070 - INFO - BCE Loss: 0.3791 2025-07-16 15:23:59,070 - INFO - Weighted BCE Loss: 0.2797 2025-07-16 15:23:59,070 - INFO - Average similarity: 0.6730 2025-07-16 15:23:59,070 - INFO - Median similarity: 0.6945 2025-07-16 15:23:59,070 - INFO - Clean sample similarity: 0.6730 2025-07-16 15:23:59,070 - INFO - Corrupted sample similarity: 0.3189 2025-07-16 15:23:59,070 - INFO - Similarity gap (clean - corrupt): 0.3541 2025-07-16 15:23:59,290 - INFO - Epoch 19/30 - Train 
Loss: 0.5264, Val Loss: 0.4808, Val BCE: 0.3791, Val wBCE: 0.2797, Clean Sim: 0.6730, Corrupt Sim: 0.3189, Gap: 0.3541, Time: 1233.07s 2025-07-16 15:23:59,290 - INFO - New best validation loss: 0.4808 2025-07-16 15:24:34,754 - INFO - log_σ² gradient: -0.384181 2025-07-16 15:24:34,827 - INFO - Optimizer step 1: log_σ²=0.304389, weight=0.737574 2025-07-16 15:24:55,505 - INFO - log_σ² gradient: -0.371162 2025-07-16 15:24:55,583 - INFO - Optimizer step 2: log_σ²=0.304964, weight=0.737150 2025-07-16 15:25:17,576 - INFO - log_σ² gradient: -0.383164 2025-07-16 15:25:17,653 - INFO - Optimizer step 3: log_σ²=0.305541, weight=0.736724 2025-07-16 15:25:38,194 - INFO - log_σ² gradient: -0.387487 2025-07-16 15:25:38,266 - INFO - Optimizer step 4: log_σ²=0.306123, weight=0.736296 2025-07-16 15:26:00,588 - INFO - log_σ² gradient: -0.383919 2025-07-16 15:26:00,660 - INFO - Optimizer step 5: log_σ²=0.306707, weight=0.735866 2025-07-16 15:26:23,268 - INFO - log_σ² gradient: -0.383353 2025-07-16 15:26:23,338 - INFO - Optimizer step 6: log_σ²=0.307294, weight=0.735434 2025-07-16 15:26:45,191 - INFO - log_σ² gradient: -0.386693 2025-07-16 15:26:45,261 - INFO - Optimizer step 7: log_σ²=0.307884, weight=0.735000 2025-07-16 15:27:07,340 - INFO - log_σ² gradient: -0.386907 2025-07-16 15:27:07,415 - INFO - Optimizer step 8: log_σ²=0.308477, weight=0.734565 2025-07-16 15:27:29,519 - INFO - log_σ² gradient: -0.383688 2025-07-16 15:27:29,593 - INFO - Optimizer step 9: log_σ²=0.309072, weight=0.734128 2025-07-16 15:27:49,987 - INFO - log_σ² gradient: -0.377560 2025-07-16 15:27:50,062 - INFO - Optimizer step 10: log_σ²=0.309669, weight=0.733690 2025-07-16 15:28:11,809 - INFO - log_σ² gradient: -0.376556 2025-07-16 15:28:11,883 - INFO - Optimizer step 11: log_σ²=0.310265, weight=0.733252 2025-07-16 15:28:32,804 - INFO - log_σ² gradient: -0.379526 2025-07-16 15:28:32,878 - INFO - Optimizer step 12: log_σ²=0.310864, weight=0.732814 2025-07-16 15:28:53,819 - INFO - log_σ² gradient: -0.383220 
2025-07-16 15:28:53,889 - INFO - Optimizer step 13: log_σ²=0.311464, weight=0.732374 2025-07-16 15:29:15,640 - INFO - log_σ² gradient: -0.368744 2025-07-16 15:29:15,719 - INFO - Optimizer step 14: log_σ²=0.312063, weight=0.731935 2025-07-16 15:29:37,287 - INFO - log_σ² gradient: -0.374851 2025-07-16 15:29:37,359 - INFO - Optimizer step 15: log_σ²=0.312663, weight=0.731496 2025-07-16 15:29:58,101 - INFO - log_σ² gradient: -0.382051 2025-07-16 15:29:58,179 - INFO - Optimizer step 16: log_σ²=0.313265, weight=0.731056 2025-07-16 15:30:19,667 - INFO - log_σ² gradient: -0.371261 2025-07-16 15:30:19,738 - INFO - Optimizer step 17: log_σ²=0.313867, weight=0.730616 2025-07-16 15:30:41,558 - INFO - log_σ² gradient: -0.378201 2025-07-16 15:30:41,633 - INFO - Optimizer step 18: log_σ²=0.314470, weight=0.730176 2025-07-16 15:31:02,912 - INFO - log_σ² gradient: -0.378018 2025-07-16 15:31:02,986 - INFO - Optimizer step 19: log_σ²=0.315074, weight=0.729735 2025-07-16 15:31:25,573 - INFO - log_σ² gradient: -0.372481 2025-07-16 15:31:25,644 - INFO - Optimizer step 20: log_σ²=0.315678, weight=0.729294 2025-07-16 15:31:48,605 - INFO - log_σ² gradient: -0.372538 2025-07-16 15:31:48,687 - INFO - Optimizer step 21: log_σ²=0.316282, weight=0.728854 2025-07-16 15:32:10,266 - INFO - log_σ² gradient: -0.374285 2025-07-16 15:32:10,337 - INFO - Optimizer step 22: log_σ²=0.316887, weight=0.728413 2025-07-16 15:32:31,821 - INFO - log_σ² gradient: -0.372486 2025-07-16 15:32:31,899 - INFO - Optimizer step 23: log_σ²=0.317492, weight=0.727973 2025-07-16 15:32:53,111 - INFO - log_σ² gradient: -0.374366 2025-07-16 15:32:53,186 - INFO - Optimizer step 24: log_σ²=0.318098, weight=0.727532 2025-07-16 15:33:13,705 - INFO - log_σ² gradient: -0.378933 2025-07-16 15:33:13,775 - INFO - Optimizer step 25: log_σ²=0.318705, weight=0.727090 2025-07-16 15:33:35,515 - INFO - log_σ² gradient: -0.372865 2025-07-16 15:33:35,586 - INFO - Optimizer step 26: log_σ²=0.319312, weight=0.726649 2025-07-16 15:33:56,586 - 
INFO - log_σ² gradient: -0.372495 2025-07-16 15:33:56,655 - INFO - Optimizer step 27: log_σ²=0.319920, weight=0.726207 2025-07-16 15:34:18,724 - INFO - log_σ² gradient: -0.373290 2025-07-16 15:34:18,801 - INFO - Optimizer step 28: log_σ²=0.320528, weight=0.725765 2025-07-16 15:34:39,685 - INFO - log_σ² gradient: -0.378182 2025-07-16 15:34:39,755 - INFO - Optimizer step 29: log_σ²=0.321138, weight=0.725323 2025-07-16 15:35:00,423 - INFO - log_σ² gradient: -0.378089 2025-07-16 15:35:00,490 - INFO - Optimizer step 30: log_σ²=0.321749, weight=0.724880 2025-07-16 15:35:22,416 - INFO - log_σ² gradient: -0.376166 2025-07-16 15:35:22,488 - INFO - Optimizer step 31: log_σ²=0.322361, weight=0.724437 2025-07-16 15:35:43,133 - INFO - log_σ² gradient: -0.373549 2025-07-16 15:35:43,207 - INFO - Optimizer step 32: log_σ²=0.322973, weight=0.723993 2025-07-16 15:36:05,101 - INFO - log_σ² gradient: -0.378138 2025-07-16 15:36:05,180 - INFO - Optimizer step 33: log_σ²=0.323587, weight=0.723549 2025-07-16 15:36:26,379 - INFO - log_σ² gradient: -0.373722 2025-07-16 15:36:26,451 - INFO - Optimizer step 34: log_σ²=0.324201, weight=0.723105 2025-07-16 15:36:48,264 - INFO - log_σ² gradient: -0.369591 2025-07-16 15:36:48,334 - INFO - Optimizer step 35: log_σ²=0.324815, weight=0.722661 2025-07-16 15:37:09,766 - INFO - log_σ² gradient: -0.370282 2025-07-16 15:37:09,839 - INFO - Optimizer step 36: log_σ²=0.325429, weight=0.722217 2025-07-16 15:37:30,383 - INFO - log_σ² gradient: -0.373093 2025-07-16 15:37:30,460 - INFO - Optimizer step 37: log_σ²=0.326044, weight=0.721773 2025-07-16 15:37:51,511 - INFO - log_σ² gradient: -0.374500 2025-07-16 15:37:51,583 - INFO - Optimizer step 38: log_σ²=0.326659, weight=0.721329 2025-07-16 15:38:13,669 - INFO - log_σ² gradient: -0.364883 2025-07-16 15:38:13,743 - INFO - Optimizer step 39: log_σ²=0.327274, weight=0.720886 2025-07-16 15:38:35,964 - INFO - log_σ² gradient: -0.379535 2025-07-16 15:38:36,039 - INFO - Optimizer step 40: log_σ²=0.327891, 
weight=0.720442 2025-07-16 15:38:57,420 - INFO - log_σ² gradient: -0.370254 2025-07-16 15:38:57,499 - INFO - Optimizer step 41: log_σ²=0.328508, weight=0.719997 2025-07-16 15:39:19,961 - INFO - log_σ² gradient: -0.374996 2025-07-16 15:39:20,038 - INFO - Optimizer step 42: log_σ²=0.329126, weight=0.719553 2025-07-16 15:39:41,165 - INFO - log_σ² gradient: -0.377377 2025-07-16 15:39:41,240 - INFO - Optimizer step 43: log_σ²=0.329745, weight=0.719107 2025-07-16 15:40:03,279 - INFO - log_σ² gradient: -0.364096 2025-07-16 15:40:03,351 - INFO - Optimizer step 44: log_σ²=0.330363, weight=0.718663 2025-07-16 15:40:24,665 - INFO - log_σ² gradient: -0.363745 2025-07-16 15:40:24,740 - INFO - Optimizer step 45: log_σ²=0.330981, weight=0.718219 2025-07-16 15:40:46,560 - INFO - log_σ² gradient: -0.368351 2025-07-16 15:40:46,634 - INFO - Optimizer step 46: log_σ²=0.331599, weight=0.717775 2025-07-16 15:41:08,145 - INFO - log_σ² gradient: -0.366137 2025-07-16 15:41:08,218 - INFO - Optimizer step 47: log_σ²=0.332216, weight=0.717332 2025-07-16 15:41:29,824 - INFO - log_σ² gradient: -0.372629 2025-07-16 15:41:29,897 - INFO - Optimizer step 48: log_σ²=0.332835, weight=0.716888 2025-07-16 15:41:39,416 - INFO - log_σ² gradient: -0.175031 2025-07-16 15:41:39,492 - INFO - Optimizer step 49: log_σ²=0.333422, weight=0.716468 2025-07-16 15:41:39,682 - INFO - Epoch 20: Total optimizer steps: 49 2025-07-16 15:44:34,788 - INFO - Validation metrics: 2025-07-16 15:44:34,788 - INFO - Loss: 0.4740 2025-07-16 15:44:34,788 - INFO - BCE Loss: 0.3725 2025-07-16 15:44:34,788 - INFO - Weighted BCE Loss: 0.2669 2025-07-16 15:44:34,788 - INFO - Average similarity: 0.7354 2025-07-16 15:44:34,788 - INFO - Median similarity: 0.7528 2025-07-16 15:44:34,788 - INFO - Clean sample similarity: 0.7354 2025-07-16 15:44:34,788 - INFO - Corrupted sample similarity: 0.3733 2025-07-16 15:44:34,788 - INFO - Similarity gap (clean - corrupt): 0.3621 2025-07-16 15:44:34,996 - INFO - Epoch 20/30 - Train Loss: 0.5069, Val 
Loss: 0.4740, Val BCE: 0.3725, Val wBCE: 0.2669, Clean Sim: 0.7354, Corrupt Sim: 0.3733, Gap: 0.3621, Time: 1232.30s 2025-07-16 15:44:34,996 - INFO - New best validation loss: 0.4740 2025-07-16 15:47:21,629 - INFO - Epoch 20 Validation Alignment: Pos=0.225, Neg=0.118, Gap=0.108 2025-07-16 15:47:53,511 - INFO - log_σ² gradient: -0.367457 2025-07-16 15:47:53,589 - INFO - Optimizer step 1: log_σ²=0.334013, weight=0.716045 2025-07-16 15:48:15,592 - INFO - log_σ² gradient: -0.371549 2025-07-16 15:48:15,663 - INFO - Optimizer step 2: log_σ²=0.334607, weight=0.715620 2025-07-16 15:48:37,056 - INFO - log_σ² gradient: -0.366998 2025-07-16 15:48:37,129 - INFO - Optimizer step 3: log_σ²=0.335204, weight=0.715192 2025-07-16 15:48:58,434 - INFO - log_σ² gradient: -0.373644 2025-07-16 15:48:58,509 - INFO - Optimizer step 4: log_σ²=0.335804, weight=0.714763 2025-07-16 15:49:20,978 - INFO - log_σ² gradient: -0.368144 2025-07-16 15:49:21,052 - INFO - Optimizer step 5: log_σ²=0.336407, weight=0.714332 2025-07-16 15:49:42,113 - INFO - log_σ² gradient: -0.356780 2025-07-16 15:49:42,185 - INFO - Optimizer step 6: log_σ²=0.337011, weight=0.713901 2025-07-16 15:50:03,610 - INFO - log_σ² gradient: -0.362219 2025-07-16 15:50:03,677 - INFO - Optimizer step 7: log_σ²=0.337616, weight=0.713469 2025-07-16 15:50:24,290 - INFO - log_σ² gradient: -0.365981 2025-07-16 15:50:24,360 - INFO - Optimizer step 8: log_σ²=0.338223, weight=0.713036 2025-07-16 15:50:46,402 - INFO - log_σ² gradient: -0.367303 2025-07-16 15:50:46,474 - INFO - Optimizer step 9: log_σ²=0.338832, weight=0.712602 2025-07-16 15:51:09,951 - INFO - log_σ² gradient: -0.365434 2025-07-16 15:51:10,021 - INFO - Optimizer step 10: log_σ²=0.339442, weight=0.712167 2025-07-16 15:51:32,902 - INFO - log_σ² gradient: -0.366009 2025-07-16 15:51:32,976 - INFO - Optimizer step 11: log_σ²=0.340055, weight=0.711731 2025-07-16 15:51:54,049 - INFO - log_σ² gradient: -0.360898 2025-07-16 15:51:54,121 - INFO - Optimizer step 12: log_σ²=0.340668, 
weight=0.711295 2025-07-16 15:52:15,607 - INFO - log_σ² gradient: -0.359340 2025-07-16 15:52:15,685 - INFO - Optimizer step 13: log_σ²=0.341281, weight=0.710859 2025-07-16 15:52:38,974 - INFO - log_σ² gradient: -0.371214 2025-07-16 15:52:39,053 - INFO - Optimizer step 14: log_σ²=0.341897, weight=0.710422 2025-07-16 15:53:01,602 - INFO - log_σ² gradient: -0.358314 2025-07-16 15:53:01,673 - INFO - Optimizer step 15: log_σ²=0.342513, weight=0.709984 2025-07-16 15:53:25,039 - INFO - log_σ² gradient: -0.368872 2025-07-16 15:53:25,118 - INFO - Optimizer step 16: log_σ²=0.343131, weight=0.709545 2025-07-16 15:53:48,085 - INFO - log_σ² gradient: -0.365555 2025-07-16 15:53:48,155 - INFO - Optimizer step 17: log_σ²=0.343750, weight=0.709106 2025-07-16 15:54:10,611 - INFO - log_σ² gradient: -0.361412 2025-07-16 15:54:10,686 - INFO - Optimizer step 18: log_σ²=0.344370, weight=0.708667 2025-07-16 15:54:32,252 - INFO - log_σ² gradient: -0.366049 2025-07-16 15:54:32,322 - INFO - Optimizer step 19: log_σ²=0.344991, weight=0.708227 2025-07-16 15:54:54,147 - INFO - log_σ² gradient: -0.365557 2025-07-16 15:54:54,222 - INFO - Optimizer step 20: log_σ²=0.345613, weight=0.707786 2025-07-16 15:55:16,021 - INFO - log_σ² gradient: -0.365817 2025-07-16 15:55:16,092 - INFO - Optimizer step 21: log_σ²=0.346237, weight=0.707345 2025-07-16 15:55:37,624 - INFO - log_σ² gradient: -0.362222 2025-07-16 15:55:37,697 - INFO - Optimizer step 22: log_σ²=0.346859, weight=0.706905 2025-07-16 15:56:00,392 - INFO - log_σ² gradient: -0.358963 2025-07-16 15:56:00,463 - INFO - Optimizer step 23: log_σ²=0.347480, weight=0.706466 2025-07-16 15:56:22,445 - INFO - log_σ² gradient: -0.358410 2025-07-16 15:56:22,515 - INFO - Optimizer step 24: log_σ²=0.348098, weight=0.706029 2025-07-16 15:56:42,993 - INFO - log_σ² gradient: -0.364271 2025-07-16 15:56:43,061 - INFO - Optimizer step 25: log_σ²=0.348716, weight=0.705593 2025-07-16 15:57:04,380 - INFO - log_σ² gradient: -0.359729 2025-07-16 15:57:04,451 - INFO - 
Optimizer step 26: log_σ²=0.349332, weight=0.705159 2025-07-16 15:57:27,074 - INFO - log_σ² gradient: -0.366655 2025-07-16 15:57:27,142 - INFO - Optimizer step 27: log_σ²=0.349948, weight=0.704725 2025-07-16 15:57:48,233 - INFO - log_σ² gradient: -0.362693 2025-07-16 15:57:48,307 - INFO - Optimizer step 28: log_σ²=0.350562, weight=0.704292 2025-07-16 15:58:09,242 - INFO - log_σ² gradient: -0.359566 2025-07-16 15:58:09,316 - INFO - Optimizer step 29: log_σ²=0.351175, weight=0.703861 2025-07-16 15:58:31,264 - INFO - log_σ² gradient: -0.358338 2025-07-16 15:58:31,334 - INFO - Optimizer step 30: log_σ²=0.351786, weight=0.703431 2025-07-16 15:58:51,767 - INFO - log_σ² gradient: -0.359892 2025-07-16 15:58:51,837 - INFO - Optimizer step 31: log_σ²=0.352395, weight=0.703002 2025-07-16 15:59:13,484 - INFO - log_σ² gradient: -0.358236 2025-07-16 15:59:13,554 - INFO - Optimizer step 32: log_σ²=0.353003, weight=0.702575 2025-07-16 15:59:35,685 - INFO - log_σ² gradient: -0.360441 2025-07-16 15:59:35,753 - INFO - Optimizer step 33: log_σ²=0.353609, weight=0.702149 2025-07-16 15:59:57,686 - INFO - log_σ² gradient: -0.360901 2025-07-16 15:59:57,760 - INFO - Optimizer step 34: log_σ²=0.354214, weight=0.701725 2025-07-16 16:00:20,301 - INFO - log_σ² gradient: -0.356568 2025-07-16 16:00:20,372 - INFO - Optimizer step 35: log_σ²=0.354817, weight=0.701302 2025-07-16 16:00:41,274 - INFO - log_σ² gradient: -0.359817 2025-07-16 16:00:41,347 - INFO - Optimizer step 36: log_σ²=0.355419, weight=0.700880 2025-07-16 16:01:03,782 - INFO - log_σ² gradient: -0.360729 2025-07-16 16:01:03,853 - INFO - Optimizer step 37: log_σ²=0.356020, weight=0.700459 2025-07-16 16:01:25,533 - INFO - log_σ² gradient: -0.361653 2025-07-16 16:01:25,603 - INFO - Optimizer step 38: log_σ²=0.356620, weight=0.700039 2025-07-16 16:01:48,324 - INFO - log_σ² gradient: -0.362957 2025-07-16 16:01:48,396 - INFO - Optimizer step 39: log_σ²=0.357218, weight=0.699620 2025-07-16 16:02:10,953 - INFO - log_σ² gradient: -0.361182 
2025-07-16 16:02:11,032 - INFO - Optimizer step 40: log_σ²=0.357816, weight=0.699202 2025-07-16 16:02:34,554 - INFO - log_σ² gradient: -0.360963 2025-07-16 16:02:34,627 - INFO - Optimizer step 41: log_σ²=0.358413, weight=0.698785 2025-07-16 16:02:56,783 - INFO - log_σ² gradient: -0.366720 2025-07-16 16:02:56,857 - INFO - Optimizer step 42: log_σ²=0.359009, weight=0.698368 2025-07-16 16:03:19,090 - INFO - log_σ² gradient: -0.358015 2025-07-16 16:03:19,159 - INFO - Optimizer step 43: log_σ²=0.359604, weight=0.697953 2025-07-16 16:03:42,471 - INFO - log_σ² gradient: -0.357936 2025-07-16 16:03:42,547 - INFO - Optimizer step 44: log_σ²=0.360197, weight=0.697539 2025-07-16 16:04:05,237 - INFO - log_σ² gradient: -0.356620 2025-07-16 16:04:05,315 - INFO - Optimizer step 45: log_σ²=0.360788, weight=0.697127 2025-07-16 16:04:27,990 - INFO - log_σ² gradient: -0.363105 2025-07-16 16:04:28,060 - INFO - Optimizer step 46: log_σ²=0.361378, weight=0.696715 2025-07-16 16:04:50,828 - INFO - log_σ² gradient: -0.364549 2025-07-16 16:04:50,902 - INFO - Optimizer step 47: log_σ²=0.361968, weight=0.696305 2025-07-16 16:05:14,316 - INFO - log_σ² gradient: -0.365803 2025-07-16 16:05:14,395 - INFO - Optimizer step 48: log_σ²=0.362557, weight=0.695894 2025-07-16 16:05:24,229 - INFO - log_σ² gradient: -0.166973 2025-07-16 16:05:24,301 - INFO - Optimizer step 49: log_σ²=0.363114, weight=0.695507 2025-07-16 16:05:24,520 - INFO - Epoch 21: Total optimizer steps: 49 2025-07-16 16:08:29,679 - INFO - Validation metrics: 2025-07-16 16:08:29,679 - INFO - Loss: 0.4529 2025-07-16 16:08:29,679 - INFO - BCE Loss: 0.3556 2025-07-16 16:08:29,679 - INFO - Weighted BCE Loss: 0.2473 2025-07-16 16:08:29,679 - INFO - Average similarity: 0.7427 2025-07-16 16:08:29,679 - INFO - Median similarity: 0.7595 2025-07-16 16:08:29,679 - INFO - Clean sample similarity: 0.7427 2025-07-16 16:08:29,679 - INFO - Corrupted sample similarity: 0.3698 2025-07-16 16:08:29,679 - INFO - Similarity gap (clean - corrupt): 0.3729 
2025-07-16 16:08:29,874 - INFO - Epoch 21/30 - Train Loss: 0.4917, Val Loss: 0.4529, Val BCE: 0.3556, Val wBCE: 0.2473, Clean Sim: 0.7427, Corrupt Sim: 0.3698, Gap: 0.3729, Time: 1268.24s 2025-07-16 16:08:29,874 - INFO - New best validation loss: 0.4529 2025-07-16 16:09:10,726 - INFO - log_σ² gradient: -0.358029 2025-07-16 16:09:10,799 - INFO - Optimizer step 1: log_σ²=0.363672, weight=0.695119 2025-07-16 16:09:33,137 - INFO - log_σ² gradient: -0.359652 2025-07-16 16:09:33,208 - INFO - Optimizer step 2: log_σ²=0.364232, weight=0.694730 2025-07-16 16:09:55,140 - INFO - log_σ² gradient: -0.356041 2025-07-16 16:09:55,214 - INFO - Optimizer step 3: log_σ²=0.364792, weight=0.694341 2025-07-16 16:10:17,820 - INFO - log_σ² gradient: -0.355719 2025-07-16 16:10:17,898 - INFO - Optimizer step 4: log_σ²=0.365353, weight=0.693952 2025-07-16 16:10:39,574 - INFO - log_σ² gradient: -0.356864 2025-07-16 16:10:39,649 - INFO - Optimizer step 5: log_σ²=0.365914, weight=0.693562 2025-07-16 16:11:03,324 - INFO - log_σ² gradient: -0.355114 2025-07-16 16:11:03,396 - INFO - Optimizer step 6: log_σ²=0.366475, weight=0.693173 2025-07-16 16:11:25,285 - INFO - log_σ² gradient: -0.354255 2025-07-16 16:11:25,355 - INFO - Optimizer step 7: log_σ²=0.367036, weight=0.692785 2025-07-16 16:11:46,286 - INFO - log_σ² gradient: -0.359958 2025-07-16 16:11:46,356 - INFO - Optimizer step 8: log_σ²=0.367598, weight=0.692396 2025-07-16 16:12:10,966 - INFO - log_σ² gradient: -0.359346 2025-07-16 16:12:11,031 - INFO - Optimizer step 9: log_σ²=0.368159, weight=0.692007 2025-07-16 16:12:34,059 - INFO - log_σ² gradient: -0.355218 2025-07-16 16:12:34,132 - INFO - Optimizer step 10: log_σ²=0.368721, weight=0.691619 2025-07-16 16:12:56,425 - INFO - log_σ² gradient: -0.356165 2025-07-16 16:12:56,498 - INFO - Optimizer step 11: log_σ²=0.369281, weight=0.691231 2025-07-16 16:13:20,466 - INFO - log_σ² gradient: -0.357272 2025-07-16 16:13:20,536 - INFO - Optimizer step 12: log_σ²=0.369842, weight=0.690844 2025-07-16 
16:13:44,639 - INFO - log_σ² gradient: -0.358311 2025-07-16 16:13:44,720 - INFO - Optimizer step 13: log_σ²=0.370402, weight=0.690457 2025-07-16 16:14:07,491 - INFO - log_σ² gradient: -0.353642 2025-07-16 16:14:07,560 - INFO - Optimizer step 14: log_σ²=0.370961, weight=0.690071 2025-07-16 16:14:30,689 - INFO - log_σ² gradient: -0.363473 2025-07-16 16:14:30,761 - INFO - Optimizer step 15: log_σ²=0.371521, weight=0.689685 2025-07-16 16:14:52,645 - INFO - log_σ² gradient: -0.355558 2025-07-16 16:14:52,715 - INFO - Optimizer step 16: log_σ²=0.372079, weight=0.689300 2025-07-16 16:15:15,505 - INFO - log_σ² gradient: -0.359987 2025-07-16 16:15:15,580 - INFO - Optimizer step 17: log_σ²=0.372638, weight=0.688915 2025-07-16 16:15:37,462 - INFO - log_σ² gradient: -0.354294 2025-07-16 16:15:37,534 - INFO - Optimizer step 18: log_σ²=0.373195, weight=0.688531 2025-07-16 16:15:59,785 - INFO - log_σ² gradient: -0.348993 2025-07-16 16:15:59,855 - INFO - Optimizer step 19: log_σ²=0.373750, weight=0.688149 2025-07-16 16:16:22,813 - INFO - log_σ² gradient: -0.354094 2025-07-16 16:16:22,891 - INFO - Optimizer step 20: log_σ²=0.374303, weight=0.687768 2025-07-16 16:16:47,203 - INFO - log_σ² gradient: -0.357942 2025-07-16 16:16:47,275 - INFO - Optimizer step 21: log_σ²=0.374857, weight=0.687388 2025-07-16 16:17:11,793 - INFO - log_σ² gradient: -0.351096 2025-07-16 16:17:11,875 - INFO - Optimizer step 22: log_σ²=0.375408, weight=0.687009 2025-07-16 16:17:33,765 - INFO - log_σ² gradient: -0.356727 2025-07-16 16:17:33,838 - INFO - Optimizer step 23: log_σ²=0.375959, weight=0.686631 2025-07-16 16:17:56,501 - INFO - log_σ² gradient: -0.352866 2025-07-16 16:17:56,575 - INFO - Optimizer step 24: log_σ²=0.376508, weight=0.686253 2025-07-16 16:18:20,224 - INFO - log_σ² gradient: -0.353027 2025-07-16 16:18:20,297 - INFO - Optimizer step 25: log_σ²=0.377056, weight=0.685878 2025-07-16 16:18:43,252 - INFO - log_σ² gradient: -0.351404 2025-07-16 16:18:43,332 - INFO - Optimizer step 26: 
log_σ²=0.377603, weight=0.685503 2025-07-16 16:19:06,803 - INFO - log_σ² gradient: -0.350224 2025-07-16 16:19:06,878 - INFO - Optimizer step 27: log_σ²=0.378147, weight=0.685130 2025-07-16 16:19:28,654 - INFO - log_σ² gradient: -0.345922 2025-07-16 16:19:28,724 - INFO - Optimizer step 28: log_σ²=0.378690, weight=0.684758 2025-07-16 16:19:49,945 - INFO - log_σ² gradient: -0.360479 2025-07-16 16:19:50,014 - INFO - Optimizer step 29: log_σ²=0.379232, weight=0.684387 2025-07-16 16:20:12,065 - INFO - log_σ² gradient: -0.345609 2025-07-16 16:20:12,138 - INFO - Optimizer step 30: log_σ²=0.379772, weight=0.684017 2025-07-16 16:20:34,473 - INFO - log_σ² gradient: -0.351645 2025-07-16 16:20:34,543 - INFO - Optimizer step 31: log_σ²=0.380311, weight=0.683649 2025-07-16 16:20:55,755 - INFO - log_σ² gradient: -0.360886 2025-07-16 16:20:55,825 - INFO - Optimizer step 32: log_σ²=0.380849, weight=0.683281 2025-07-16 16:21:18,821 - INFO - log_σ² gradient: -0.355037 2025-07-16 16:21:18,895 - INFO - Optimizer step 33: log_σ²=0.381387, weight=0.682913 2025-07-16 16:21:41,115 - INFO - log_σ² gradient: -0.347941 2025-07-16 16:21:41,185 - INFO - Optimizer step 34: log_σ²=0.381923, weight=0.682548 2025-07-16 16:22:03,047 - INFO - log_σ² gradient: -0.348045 2025-07-16 16:22:03,120 - INFO - Optimizer step 35: log_σ²=0.382457, weight=0.682183 2025-07-16 16:22:23,866 - INFO - log_σ² gradient: -0.348273 2025-07-16 16:22:23,937 - INFO - Optimizer step 36: log_σ²=0.382989, weight=0.681820 2025-07-16 16:22:45,325 - INFO - log_σ² gradient: -0.347505 2025-07-16 16:22:45,400 - INFO - Optimizer step 37: log_σ²=0.383519, weight=0.681459 2025-07-16 16:23:08,505 - INFO - log_σ² gradient: -0.354013 2025-07-16 16:23:08,572 - INFO - Optimizer step 38: log_σ²=0.384049, weight=0.681098 2025-07-16 16:23:28,471 - INFO - log_σ² gradient: -0.354552 2025-07-16 16:23:28,546 - INFO - Optimizer step 39: log_σ²=0.384578, weight=0.680738 2025-07-16 16:23:51,588 - INFO - log_σ² gradient: -0.344566 2025-07-16 
16:23:51,666 - INFO - Optimizer step 40: log_σ²=0.385104, weight=0.680380 2025-07-16 16:24:13,829 - INFO - log_σ² gradient: -0.343955 2025-07-16 16:24:13,899 - INFO - Optimizer step 41: log_σ²=0.385628, weight=0.680023 2025-07-16 16:24:37,872 - INFO - log_σ² gradient: -0.352759 2025-07-16 16:24:37,950 - INFO - Optimizer step 42: log_σ²=0.386151, weight=0.679668 2025-07-16 16:25:00,857 - INFO - log_σ² gradient: -0.348990 2025-07-16 16:25:00,931 - INFO - Optimizer step 43: log_σ²=0.386673, weight=0.679313 2025-07-16 16:25:22,944 - INFO - log_σ² gradient: -0.350014 2025-07-16 16:25:23,023 - INFO - Optimizer step 44: log_σ²=0.387194, weight=0.678959 2025-07-16 16:25:44,570 - INFO - log_σ² gradient: -0.347484 2025-07-16 16:25:44,641 - INFO - Optimizer step 45: log_σ²=0.387713, weight=0.678607 2025-07-16 16:26:05,766 - INFO - log_σ² gradient: -0.347000 2025-07-16 16:26:05,834 - INFO - Optimizer step 46: log_σ²=0.388230, weight=0.678256 2025-07-16 16:26:28,958 - INFO - log_σ² gradient: -0.350605 2025-07-16 16:26:29,029 - INFO - Optimizer step 47: log_σ²=0.388747, weight=0.677906 2025-07-16 16:26:51,739 - INFO - log_σ² gradient: -0.352077 2025-07-16 16:26:51,812 - INFO - Optimizer step 48: log_σ²=0.389262, weight=0.677557 2025-07-16 16:27:01,776 - INFO - log_σ² gradient: -0.157593 2025-07-16 16:27:01,848 - INFO - Optimizer step 49: log_σ²=0.389748, weight=0.677227 2025-07-16 16:27:02,061 - INFO - Epoch 22: Total optimizer steps: 49 2025-07-16 16:30:01,098 - INFO - Validation metrics: 2025-07-16 16:30:01,098 - INFO - Loss: 0.4402 2025-07-16 16:30:01,098 - INFO - BCE Loss: 0.3449 2025-07-16 16:30:01,098 - INFO - Weighted BCE Loss: 0.2336 2025-07-16 16:30:01,098 - INFO - Average similarity: 0.7340 2025-07-16 16:30:01,098 - INFO - Median similarity: 0.7494 2025-07-16 16:30:01,098 - INFO - Clean sample similarity: 0.7340 2025-07-16 16:30:01,098 - INFO - Corrupted sample similarity: 0.3449 2025-07-16 16:30:01,099 - INFO - Similarity gap (clean - corrupt): 0.3891 2025-07-16 
16:30:01,275 - INFO - Epoch 22/30 - Train Loss: 0.4792, Val Loss: 0.4402, Val BCE: 0.3449, Val wBCE: 0.2336, Clean Sim: 0.7340, Corrupt Sim: 0.3449, Gap: 0.3891, Time: 1287.79s 2025-07-16 16:30:01,275 - INFO - New best validation loss: 0.4402 2025-07-16 16:32:49,528 - INFO - Epoch 22 Validation Alignment: Pos=0.211, Neg=0.102, Gap=0.109 2025-07-16 16:33:22,884 - INFO - log_σ² gradient: -0.347360 2025-07-16 16:33:22,956 - INFO - Optimizer step 1: log_σ²=0.390235, weight=0.676897 2025-07-16 16:33:44,349 - INFO - log_σ² gradient: -0.351831 2025-07-16 16:33:44,422 - INFO - Optimizer step 2: log_σ²=0.390725, weight=0.676566 2025-07-16 16:34:06,572 - INFO - log_σ² gradient: -0.340781 2025-07-16 16:34:06,642 - INFO - Optimizer step 3: log_σ²=0.391214, weight=0.676236 2025-07-16 16:34:28,841 - INFO - log_σ² gradient: -0.342276 2025-07-16 16:34:28,913 - INFO - Optimizer step 4: log_σ²=0.391702, weight=0.675905 2025-07-16 16:34:50,131 - INFO - log_σ² gradient: -0.354206 2025-07-16 16:34:50,201 - INFO - Optimizer step 5: log_σ²=0.392193, weight=0.675574 2025-07-16 16:35:13,634 - INFO - log_σ² gradient: -0.344434 2025-07-16 16:35:13,712 - INFO - Optimizer step 6: log_σ²=0.392683, weight=0.675243 2025-07-16 16:35:36,623 - INFO - log_σ² gradient: -0.342090 2025-07-16 16:35:36,697 - INFO - Optimizer step 7: log_σ²=0.393173, weight=0.674912 2025-07-16 16:35:59,058 - INFO - log_σ² gradient: -0.347094 2025-07-16 16:35:59,133 - INFO - Optimizer step 8: log_σ²=0.393663, weight=0.674582 2025-07-16 16:36:22,099 - INFO - log_σ² gradient: -0.347188 2025-07-16 16:36:22,174 - INFO - Optimizer step 9: log_σ²=0.394152, weight=0.674251 2025-07-16 16:36:42,671 - INFO - log_σ² gradient: -0.355543 2025-07-16 16:36:42,749 - INFO - Optimizer step 10: log_σ²=0.394643, weight=0.673920 2025-07-16 16:37:05,520 - INFO - log_σ² gradient: -0.345551 2025-07-16 16:37:05,592 - INFO - Optimizer step 11: log_σ²=0.395134, weight=0.673590 2025-07-16 16:37:28,130 - INFO - log_σ² gradient: -0.349486 2025-07-16 
16:37:28,200 - INFO - Optimizer step 12: log_σ²=0.395624, weight=0.673260 2025-07-16 16:37:50,651 - INFO - log_σ² gradient: -0.342726 2025-07-16 16:37:50,721 - INFO - Optimizer step 13: log_σ²=0.396113, weight=0.672931 2025-07-16 16:38:13,311 - INFO - log_σ² gradient: -0.347025 2025-07-16 16:38:13,380 - INFO - Optimizer step 14: log_σ²=0.396601, weight=0.672602 2025-07-16 16:38:36,173 - INFO - log_σ² gradient: -0.349486 2025-07-16 16:38:36,241 - INFO - Optimizer step 15: log_σ²=0.397089, weight=0.672274 2025-07-16 16:38:57,985 - INFO - log_σ² gradient: -0.340342 2025-07-16 16:38:58,056 - INFO - Optimizer step 16: log_σ²=0.397576, weight=0.671947 2025-07-16 16:39:20,247 - INFO - log_σ² gradient: -0.342327 2025-07-16 16:39:20,316 - INFO - Optimizer step 17: log_σ²=0.398061, weight=0.671621 2025-07-16 16:39:41,370 - INFO - log_σ² gradient: -0.331549 2025-07-16 16:39:41,453 - INFO - Optimizer step 18: log_σ²=0.398543, weight=0.671297 2025-07-16 16:40:04,282 - INFO - log_σ² gradient: -0.344398 2025-07-16 16:40:04,354 - INFO - Optimizer step 19: log_σ²=0.399025, weight=0.670974 2025-07-16 16:40:27,006 - INFO - log_σ² gradient: -0.342600 2025-07-16 16:40:27,080 - INFO - Optimizer step 20: log_σ²=0.399505, weight=0.670652 2025-07-16 16:40:50,279 - INFO - log_σ² gradient: -0.344281 2025-07-16 16:40:50,360 - INFO - Optimizer step 21: log_σ²=0.399985, weight=0.670330 2025-07-16 16:41:11,600 - INFO - log_σ² gradient: -0.343988 2025-07-16 16:41:11,670 - INFO - Optimizer step 22: log_σ²=0.400463, weight=0.670009 2025-07-16 16:41:35,348 - INFO - log_σ² gradient: -0.341261 2025-07-16 16:41:35,425 - INFO - Optimizer step 23: log_σ²=0.400941, weight=0.669690 2025-07-16 16:41:58,417 - INFO - log_σ² gradient: -0.338275 2025-07-16 16:41:58,489 - INFO - Optimizer step 24: log_σ²=0.401416, weight=0.669372 2025-07-16 16:42:20,519 - INFO - log_σ² gradient: -0.335968 2025-07-16 16:42:20,593 - INFO - Optimizer step 25: log_σ²=0.401890, weight=0.669055 2025-07-16 16:42:42,011 - INFO - log_σ² 
gradient: -0.341260 2025-07-16 16:42:42,088 - INFO - Optimizer step 26: log_σ²=0.402362, weight=0.668739 2025-07-16 16:43:05,646 - INFO - log_σ² gradient: -0.342708 2025-07-16 16:43:05,716 - INFO - Optimizer step 27: log_σ²=0.402833, weight=0.668424 2025-07-16 16:43:28,284 - INFO - log_σ² gradient: -0.349395 2025-07-16 16:43:28,362 - INFO - Optimizer step 28: log_σ²=0.403304, weight=0.668109 2025-07-16 16:43:49,968 - INFO - log_σ² gradient: -0.343882 2025-07-16 16:43:50,042 - INFO - Optimizer step 29: log_σ²=0.403775, weight=0.667795 2025-07-16 16:44:12,207 - INFO - log_σ² gradient: -0.344435 2025-07-16 16:44:12,275 - INFO - Optimizer step 30: log_σ²=0.404244, weight=0.667481 2025-07-16 16:44:33,414 - INFO - log_σ² gradient: -0.340989 2025-07-16 16:44:33,486 - INFO - Optimizer step 31: log_σ²=0.404712, weight=0.667169 2025-07-16 16:44:54,984 - INFO - log_σ² gradient: -0.338646 2025-07-16 16:44:55,057 - INFO - Optimizer step 32: log_σ²=0.405178, weight=0.666858 2025-07-16 16:45:16,971 - INFO - log_σ² gradient: -0.345462 2025-07-16 16:45:17,045 - INFO - Optimizer step 33: log_σ²=0.405644, weight=0.666548 2025-07-16 16:45:39,549 - INFO - log_σ² gradient: -0.342705 2025-07-16 16:45:39,619 - INFO - Optimizer step 34: log_σ²=0.406108, weight=0.666238 2025-07-16 16:46:02,075 - INFO - log_σ² gradient: -0.341002 2025-07-16 16:46:02,144 - INFO - Optimizer step 35: log_σ²=0.406571, weight=0.665930 2025-07-16 16:46:23,448 - INFO - log_σ² gradient: -0.345071 2025-07-16 16:46:23,518 - INFO - Optimizer step 36: log_σ²=0.407034, weight=0.665622 2025-07-16 16:46:46,370 - INFO - log_σ² gradient: -0.339269 2025-07-16 16:46:46,443 - INFO - Optimizer step 37: log_σ²=0.407494, weight=0.665315 2025-07-16 16:47:09,294 - INFO - log_σ² gradient: -0.340286 2025-07-16 16:47:09,366 - INFO - Optimizer step 38: log_σ²=0.407954, weight=0.665010 2025-07-16 16:47:30,600 - INFO - log_σ² gradient: -0.344287 2025-07-16 16:47:30,668 - INFO - Optimizer step 39: log_σ²=0.408412, weight=0.664705 
2025-07-16 16:47:52,582 - INFO - log_σ² gradient: -0.344399 2025-07-16 16:47:52,653 - INFO - Optimizer step 40: log_σ²=0.408869, weight=0.664401 2025-07-16 16:48:15,379 - INFO - log_σ² gradient: -0.339675 2025-07-16 16:48:15,454 - INFO - Optimizer step 41: log_σ²=0.409325, weight=0.664098 2025-07-16 16:48:38,495 - INFO - log_σ² gradient: -0.338745 2025-07-16 16:48:38,566 - INFO - Optimizer step 42: log_σ²=0.409780, weight=0.663796 2025-07-16 16:49:00,526 - INFO - log_σ² gradient: -0.338599 2025-07-16 16:49:00,596 - INFO - Optimizer step 43: log_σ²=0.410233, weight=0.663496 2025-07-16 16:49:23,511 - INFO - log_σ² gradient: -0.338416 2025-07-16 16:49:23,586 - INFO - Optimizer step 44: log_σ²=0.410684, weight=0.663197 2025-07-16 16:49:47,011 - INFO - log_σ² gradient: -0.339521 2025-07-16 16:49:47,081 - INFO - Optimizer step 45: log_σ²=0.411133, weight=0.662898 2025-07-16 16:50:09,136 - INFO - log_σ² gradient: -0.340790 2025-07-16 16:50:09,211 - INFO - Optimizer step 46: log_σ²=0.411582, weight=0.662601 2025-07-16 16:50:32,371 - INFO - log_σ² gradient: -0.337163 2025-07-16 16:50:32,449 - INFO - Optimizer step 47: log_σ²=0.412029, weight=0.662305 2025-07-16 16:50:55,178 - INFO - log_σ² gradient: -0.342789 2025-07-16 16:50:55,261 - INFO - Optimizer step 48: log_σ²=0.412475, weight=0.662010 2025-07-16 16:51:05,307 - INFO - log_σ² gradient: -0.162002 2025-07-16 16:51:05,374 - INFO - Optimizer step 49: log_σ²=0.412897, weight=0.661731 2025-07-16 16:51:05,564 - INFO - Epoch 23: Total optimizer steps: 49 2025-07-16 16:54:03,776 - INFO - Validation metrics: 2025-07-16 16:54:03,776 - INFO - Loss: 0.4363 2025-07-16 16:54:03,776 - INFO - BCE Loss: 0.3396 2025-07-16 16:54:03,776 - INFO - Weighted BCE Loss: 0.2247 2025-07-16 16:54:03,776 - INFO - Average similarity: 0.7395 2025-07-16 16:54:03,776 - INFO - Median similarity: 0.7608 2025-07-16 16:54:03,776 - INFO - Clean sample similarity: 0.7395 2025-07-16 16:54:03,776 - INFO - Corrupted sample similarity: 0.3474 2025-07-16 
16:54:03,776 - INFO - Similarity gap (clean - corrupt): 0.3921 2025-07-16 16:54:03,967 - INFO - Epoch 23/30 - Train Loss: 0.4708, Val Loss: 0.4363, Val BCE: 0.3396, Val wBCE: 0.2247, Clean Sim: 0.7395, Corrupt Sim: 0.3474, Gap: 0.3921, Time: 1274.44s 2025-07-16 16:54:03,967 - INFO - New best validation loss: 0.4363 2025-07-16 16:54:07,323 - INFO - New best similarity gap: 0.3921 2025-07-16 16:54:42,286 - INFO - log_σ² gradient: -0.339154 2025-07-16 16:54:42,361 - INFO - Optimizer step 1: log_σ²=0.413319, weight=0.661451 2025-07-16 16:55:04,480 - INFO - log_σ² gradient: -0.341082 2025-07-16 16:55:04,557 - INFO - Optimizer step 2: log_σ²=0.413743, weight=0.661171 2025-07-16 16:55:25,894 - INFO - log_σ² gradient: -0.342317 2025-07-16 16:55:25,967 - INFO - Optimizer step 3: log_σ²=0.414168, weight=0.660890 2025-07-16 16:55:48,271 - INFO - log_σ² gradient: -0.344903 2025-07-16 16:55:48,346 - INFO - Optimizer step 4: log_σ²=0.414593, weight=0.660609 2025-07-16 16:56:11,194 - INFO - log_σ² gradient: -0.333246 2025-07-16 16:56:11,273 - INFO - Optimizer step 5: log_σ²=0.415018, weight=0.660328 2025-07-16 16:56:33,791 - INFO - log_σ² gradient: -0.338313 2025-07-16 16:56:33,859 - INFO - Optimizer step 6: log_σ²=0.415443, weight=0.660048 2025-07-16 16:56:55,522 - INFO - log_σ² gradient: -0.338846 2025-07-16 16:56:55,592 - INFO - Optimizer step 7: log_σ²=0.415868, weight=0.659767 2025-07-16 16:57:18,765 - INFO - log_σ² gradient: -0.335137 2025-07-16 16:57:18,840 - INFO - Optimizer step 8: log_σ²=0.416293, weight=0.659487 2025-07-16 16:57:40,656 - INFO - log_σ² gradient: -0.335406 2025-07-16 16:57:40,726 - INFO - Optimizer step 9: log_σ²=0.416716, weight=0.659208 2025-07-16 16:58:03,013 - INFO - log_σ² gradient: -0.340824 2025-07-16 16:58:03,085 - INFO - Optimizer step 10: log_σ²=0.417140, weight=0.658929 2025-07-16 16:58:23,564 - INFO - log_σ² gradient: -0.334353 2025-07-16 16:58:23,638 - INFO - Optimizer step 11: log_σ²=0.417562, weight=0.658650 2025-07-16 16:58:46,301 - INFO 
- log_σ² gradient: -0.336411 2025-07-16 16:58:46,372 - INFO - Optimizer step 12: log_σ²=0.417984, weight=0.658373 2025-07-16 16:59:08,304 - INFO - log_σ² gradient: -0.335266 2025-07-16 16:59:08,370 - INFO - Optimizer step 13: log_σ²=0.418405, weight=0.658096 2025-07-16 16:59:28,997 - INFO - log_σ² gradient: -0.332385 2025-07-16 16:59:29,066 - INFO - Optimizer step 14: log_σ²=0.418825, weight=0.657820 2025-07-16 16:59:51,523 - INFO - log_σ² gradient: -0.336175 2025-07-16 16:59:51,589 - INFO - Optimizer step 15: log_σ²=0.419243, weight=0.657544 2025-07-16 17:00:16,125 - INFO - log_σ² gradient: -0.331124 2025-07-16 17:00:16,202 - INFO - Optimizer step 16: log_σ²=0.419661, weight=0.657270 2025-07-16 17:00:38,612 - INFO - log_σ² gradient: -0.329713 2025-07-16 17:00:38,679 - INFO - Optimizer step 17: log_σ²=0.420076, weight=0.656997 2025-07-16 17:01:01,135 - INFO - log_σ² gradient: -0.337733 2025-07-16 17:01:01,205 - INFO - Optimizer step 18: log_σ²=0.420491, weight=0.656724 2025-07-16 17:01:22,839 - INFO - log_σ² gradient: -0.334289 2025-07-16 17:01:22,907 - INFO - Optimizer step 19: log_σ²=0.420906, weight=0.656452 2025-07-16 17:01:45,124 - INFO - log_σ² gradient: -0.335240 2025-07-16 17:01:45,200 - INFO - Optimizer step 20: log_σ²=0.421319, weight=0.656181 2025-07-16 17:02:08,940 - INFO - log_σ² gradient: -0.341250 2025-07-16 17:02:09,014 - INFO - Optimizer step 21: log_σ²=0.421731, weight=0.655910 2025-07-16 17:02:30,912 - INFO - log_σ² gradient: -0.342555 2025-07-16 17:02:30,986 - INFO - Optimizer step 22: log_σ²=0.422144, weight=0.655640 2025-07-16 17:02:52,987 - INFO - log_σ² gradient: -0.340744 2025-07-16 17:02:53,058 - INFO - Optimizer step 23: log_σ²=0.422556, weight=0.655369 2025-07-16 17:03:16,017 - INFO - log_σ² gradient: -0.337784 2025-07-16 17:03:16,087 - INFO - Optimizer step 24: log_σ²=0.422967, weight=0.655100 2025-07-16 17:03:38,236 - INFO - log_σ² gradient: -0.336619 2025-07-16 17:03:38,311 - INFO - Optimizer step 25: log_σ²=0.423378, weight=0.654831 
2025-07-16 17:03:59,312 - INFO - log_σ² gradient: -0.337319 2025-07-16 17:03:59,384 - INFO - Optimizer step 26: log_σ²=0.423787, weight=0.654564 2025-07-16 17:04:22,429 - INFO - log_σ² gradient: -0.325974 2025-07-16 17:04:22,499 - INFO - Optimizer step 27: log_σ²=0.424193, weight=0.654297 2025-07-16 17:04:44,104 - INFO - log_σ² gradient: -0.336028 2025-07-16 17:04:44,177 - INFO - Optimizer step 28: log_σ²=0.424599, weight=0.654032 2025-07-16 17:05:06,267 - INFO - log_σ² gradient: -0.332989 2025-07-16 17:05:06,339 - INFO - Optimizer step 29: log_σ²=0.425003, weight=0.653768 2025-07-16 17:05:28,522 - INFO - log_σ² gradient: -0.329191 2025-07-16 17:05:28,595 - INFO - Optimizer step 30: log_σ²=0.425405, weight=0.653505 2025-07-16 17:05:50,638 - INFO - log_σ² gradient: -0.328626 2025-07-16 17:05:50,710 - INFO - Optimizer step 31: log_σ²=0.425806, weight=0.653243 2025-07-16 17:06:12,591 - INFO - log_σ² gradient: -0.330125 2025-07-16 17:06:12,660 - INFO - Optimizer step 32: log_σ²=0.426205, weight=0.652983 2025-07-16 17:06:34,062 - INFO - log_σ² gradient: -0.338711 2025-07-16 17:06:34,131 - INFO - Optimizer step 33: log_σ²=0.426603, weight=0.652723 2025-07-16 17:06:55,278 - INFO - log_σ² gradient: -0.337665 2025-07-16 17:06:55,348 - INFO - Optimizer step 34: log_σ²=0.427001, weight=0.652463 2025-07-16 17:07:18,460 - INFO - log_σ² gradient: -0.331918 2025-07-16 17:07:18,535 - INFO - Optimizer step 35: log_σ²=0.427397, weight=0.652205 2025-07-16 17:07:40,435 - INFO - log_σ² gradient: -0.333758 2025-07-16 17:07:40,502 - INFO - Optimizer step 36: log_σ²=0.427792, weight=0.651947 2025-07-16 17:08:02,105 - INFO - log_σ² gradient: -0.331026 2025-07-16 17:08:02,176 - INFO - Optimizer step 37: log_σ²=0.428185, weight=0.651691 2025-07-16 17:08:25,371 - INFO - log_σ² gradient: -0.329910 2025-07-16 17:08:25,446 - INFO - Optimizer step 38: log_σ²=0.428577, weight=0.651436 2025-07-16 17:08:48,045 - INFO - log_σ² gradient: -0.329356 2025-07-16 17:08:48,111 - INFO - Optimizer step 39: 
log_σ²=0.428967, weight=0.651181 2025-07-16 17:09:10,851 - INFO - log_σ² gradient: -0.332240 2025-07-16 17:09:10,925 - INFO - Optimizer step 40: log_σ²=0.429356, weight=0.650928 2025-07-16 17:09:32,080 - INFO - log_σ² gradient: -0.331635 2025-07-16 17:09:32,151 - INFO - Optimizer step 41: log_σ²=0.429743, weight=0.650676 2025-07-16 17:09:54,292 - INFO - log_σ² gradient: -0.323897 2025-07-16 17:09:54,362 - INFO - Optimizer step 42: log_σ²=0.430129, weight=0.650425 2025-07-16 17:10:16,748 - INFO - log_σ² gradient: -0.336283 2025-07-16 17:10:16,818 - INFO - Optimizer step 43: log_σ²=0.430513, weight=0.650175 2025-07-16 17:10:37,335 - INFO - log_σ² gradient: -0.333189 2025-07-16 17:10:37,405 - INFO - Optimizer step 44: log_σ²=0.430897, weight=0.649926 2025-07-16 17:10:59,958 - INFO - log_σ² gradient: -0.337041 2025-07-16 17:11:00,028 - INFO - Optimizer step 45: log_σ²=0.431280, weight=0.649677 2025-07-16 17:11:21,560 - INFO - log_σ² gradient: -0.329021 2025-07-16 17:11:21,638 - INFO - Optimizer step 46: log_σ²=0.431662, weight=0.649429 2025-07-16 17:11:43,719 - INFO - log_σ² gradient: -0.334356 2025-07-16 17:11:43,797 - INFO - Optimizer step 47: log_σ²=0.432042, weight=0.649182 2025-07-16 17:12:06,124 - INFO - log_σ² gradient: -0.335154 2025-07-16 17:12:06,197 - INFO - Optimizer step 48: log_σ²=0.432422, weight=0.648936 2025-07-16 17:12:16,444 - INFO - log_σ² gradient: -0.157238 2025-07-16 17:12:16,514 - INFO - Optimizer step 49: log_σ²=0.432780, weight=0.648703 2025-07-16 17:12:16,686 - INFO - Epoch 24: Total optimizer steps: 49 2025-07-16 17:15:15,894 - INFO - Validation metrics: 2025-07-16 17:15:15,894 - INFO - Loss: 0.4235 2025-07-16 17:15:15,894 - INFO - BCE Loss: 0.3325 2025-07-16 17:15:15,894 - INFO - Weighted BCE Loss: 0.2157 2025-07-16 17:15:15,894 - INFO - Average similarity: 0.7402 2025-07-16 17:15:15,894 - INFO - Median similarity: 0.7518 2025-07-16 17:15:15,894 - INFO - Clean sample similarity: 0.7402 2025-07-16 17:15:15,894 - INFO - Corrupted sample 
similarity: 0.3603 2025-07-16 17:15:15,894 - INFO - Similarity gap (clean - corrupt): 0.3799 2025-07-16 17:15:16,095 - INFO - Epoch 24/30 - Train Loss: 0.4551, Val Loss: 0.4235, Val BCE: 0.3325, Val wBCE: 0.2157, Clean Sim: 0.7402, Corrupt Sim: 0.3603, Gap: 0.3799, Time: 1265.39s 2025-07-16 17:15:16,096 - INFO - New best validation loss: 0.4235 2025-07-16 17:18:04,405 - INFO - Epoch 24 Validation Alignment: Pos=0.218, Neg=0.108, Gap=0.110 2025-07-16 17:18:38,283 - INFO - log_σ² gradient: -0.333491 2025-07-16 17:18:38,355 - INFO - Optimizer step 1: log_σ²=0.433139, weight=0.648470 2025-07-16 17:19:00,361 - INFO - log_σ² gradient: -0.328566 2025-07-16 17:19:00,438 - INFO - Optimizer step 2: log_σ²=0.433499, weight=0.648237 2025-07-16 17:19:22,785 - INFO - log_σ² gradient: -0.335258 2025-07-16 17:19:22,855 - INFO - Optimizer step 3: log_σ²=0.433859, weight=0.648003 2025-07-16 17:19:44,309 - INFO - log_σ² gradient: -0.326384 2025-07-16 17:19:44,384 - INFO - Optimizer step 4: log_σ²=0.434219, weight=0.647770 2025-07-16 17:20:06,735 - INFO - log_σ² gradient: -0.325717 2025-07-16 17:20:06,808 - INFO - Optimizer step 5: log_σ²=0.434579, weight=0.647537 2025-07-16 17:20:29,859 - INFO - log_σ² gradient: -0.332261 2025-07-16 17:20:29,932 - INFO - Optimizer step 6: log_σ²=0.434938, weight=0.647305 2025-07-16 17:20:51,549 - INFO - log_σ² gradient: -0.330605 2025-07-16 17:20:51,627 - INFO - Optimizer step 7: log_σ²=0.435298, weight=0.647072 2025-07-16 17:21:13,062 - INFO - log_σ² gradient: -0.332153 2025-07-16 17:21:13,131 - INFO - Optimizer step 8: log_σ²=0.435657, weight=0.646840 2025-07-16 17:21:33,750 - INFO - log_σ² gradient: -0.330690 2025-07-16 17:21:33,827 - INFO - Optimizer step 9: log_σ²=0.436016, weight=0.646608 2025-07-16 17:21:56,665 - INFO - log_σ² gradient: -0.331935 2025-07-16 17:21:56,739 - INFO - Optimizer step 10: log_σ²=0.436374, weight=0.646376 2025-07-16 17:22:18,434 - INFO - log_σ² gradient: -0.325985 2025-07-16 17:22:18,505 - INFO - Optimizer step 11: 
log_σ²=0.436732, weight=0.646145 2025-07-16 17:22:39,607 - INFO - log_σ² gradient: -0.323328 2025-07-16 17:22:39,685 - INFO - Optimizer step 12: log_σ²=0.437088, weight=0.645915 2025-07-16 17:23:01,372 - INFO - log_σ² gradient: -0.335942 2025-07-16 17:23:01,442 - INFO - Optimizer step 13: log_σ²=0.437444, weight=0.645685 2025-07-16 17:23:23,246 - INFO - log_σ² gradient: -0.323670 2025-07-16 17:23:23,321 - INFO - Optimizer step 14: log_σ²=0.437798, weight=0.645456 2025-07-16 17:23:44,722 - INFO - log_σ² gradient: -0.332549 2025-07-16 17:23:44,796 - INFO - Optimizer step 15: log_σ²=0.438152, weight=0.645227 2025-07-16 17:24:06,855 - INFO - log_σ² gradient: -0.326052 2025-07-16 17:24:06,924 - INFO - Optimizer step 16: log_σ²=0.438505, weight=0.645000 2025-07-16 17:24:29,831 - INFO - log_σ² gradient: -0.326010 2025-07-16 17:24:29,903 - INFO - Optimizer step 17: log_σ²=0.438857, weight=0.644773 2025-07-16 17:24:52,266 - INFO - log_σ² gradient: -0.325190 2025-07-16 17:24:52,340 - INFO - Optimizer step 18: log_σ²=0.439208, weight=0.644547 2025-07-16 17:25:14,234 - INFO - log_σ² gradient: -0.332781 2025-07-16 17:25:14,313 - INFO - Optimizer step 19: log_σ²=0.439557, weight=0.644322 2025-07-16 17:25:37,566 - INFO - log_σ² gradient: -0.333641 2025-07-16 17:25:37,636 - INFO - Optimizer step 20: log_σ²=0.439907, weight=0.644096 2025-07-16 17:26:00,549 - INFO - log_σ² gradient: -0.326792 2025-07-16 17:26:00,631 - INFO - Optimizer step 21: log_σ²=0.440255, weight=0.643872 2025-07-16 17:26:22,778 - INFO - log_σ² gradient: -0.328263 2025-07-16 17:26:22,856 - INFO - Optimizer step 22: log_σ²=0.440602, weight=0.643649 2025-07-16 17:26:43,844 - INFO - log_σ² gradient: -0.327882 2025-07-16 17:26:43,916 - INFO - Optimizer step 23: log_σ²=0.440948, weight=0.643426 2025-07-16 17:27:05,570 - INFO - log_σ² gradient: -0.330241 2025-07-16 17:27:05,641 - INFO - Optimizer step 24: log_σ²=0.441293, weight=0.643204 2025-07-16 17:27:27,916 - INFO - log_σ² gradient: -0.324707 2025-07-16 
17:27:27,989 - INFO - Optimizer step 25: log_σ²=0.441637, weight=0.642983 2025-07-16 17:27:49,412 - INFO - log_σ² gradient: -0.329171 2025-07-16 17:27:49,491 - INFO - Optimizer step 26: log_σ²=0.441979, weight=0.642763 2025-07-16 17:28:10,567 - INFO - log_σ² gradient: -0.328159 2025-07-16 17:28:10,635 - INFO - Optimizer step 27: log_σ²=0.442321, weight=0.642543 2025-07-16 17:28:31,341 - INFO - log_σ² gradient: -0.328910 2025-07-16 17:28:31,412 - INFO - Optimizer step 28: log_σ²=0.442661, weight=0.642325 2025-07-16 17:28:52,997 - INFO - log_σ² gradient: -0.323821 2025-07-16 17:28:53,065 - INFO - Optimizer step 29: log_σ²=0.443000, weight=0.642107 2025-07-16 17:29:14,797 - INFO - log_σ² gradient: -0.335261 2025-07-16 17:29:14,877 - INFO - Optimizer step 30: log_σ²=0.443338, weight=0.641890 2025-07-16 17:29:36,877 - INFO - log_σ² gradient: -0.327349 2025-07-16 17:29:36,950 - INFO - Optimizer step 31: log_σ²=0.443675, weight=0.641674 2025-07-16 17:29:58,800 - INFO - log_σ² gradient: -0.327213 2025-07-16 17:29:58,875 - INFO - Optimizer step 32: log_σ²=0.444011, weight=0.641458 2025-07-16 17:30:21,045 - INFO - log_σ² gradient: -0.327055 2025-07-16 17:30:21,117 - INFO - Optimizer step 33: log_σ²=0.444346, weight=0.641244 2025-07-16 17:30:42,659 - INFO - log_σ² gradient: -0.325041 2025-07-16 17:30:42,727 - INFO - Optimizer step 34: log_σ²=0.444679, weight=0.641030 2025-07-16 17:31:04,334 - INFO - log_σ² gradient: -0.330686 2025-07-16 17:31:04,408 - INFO - Optimizer step 35: log_σ²=0.445011, weight=0.640817 2025-07-16 17:31:27,013 - INFO - log_σ² gradient: -0.329234 2025-07-16 17:31:27,088 - INFO - Optimizer step 36: log_σ²=0.445342, weight=0.640605 2025-07-16 17:31:48,647 - INFO - log_σ² gradient: -0.328410 2025-07-16 17:31:48,715 - INFO - Optimizer step 37: log_σ²=0.445672, weight=0.640394 2025-07-16 17:32:10,594 - INFO - log_σ² gradient: -0.328672 2025-07-16 17:32:10,672 - INFO - Optimizer step 38: log_σ²=0.446001, weight=0.640183 2025-07-16 17:32:32,435 - INFO - log_σ² 
gradient: -0.325563 2025-07-16 17:32:32,508 - INFO - Optimizer step 39: log_σ²=0.446329, weight=0.639973 2025-07-16 17:32:54,470 - INFO - log_σ² gradient: -0.329450 2025-07-16 17:32:54,541 - INFO - Optimizer step 40: log_σ²=0.446655, weight=0.639765 2025-07-16 17:33:15,595 - INFO - log_σ² gradient: -0.323641 2025-07-16 17:33:15,664 - INFO - Optimizer step 41: log_σ²=0.446980, weight=0.639557 2025-07-16 17:33:36,658 - INFO - log_σ² gradient: -0.331680 2025-07-16 17:33:36,732 - INFO - Optimizer step 42: log_σ²=0.447304, weight=0.639350 2025-07-16 17:33:57,549 - INFO - log_σ² gradient: -0.321976 2025-07-16 17:33:57,620 - INFO - Optimizer step 43: log_σ²=0.447626, weight=0.639144 2025-07-16 17:34:19,060 - INFO - log_σ² gradient: -0.327940 2025-07-16 17:34:19,129 - INFO - Optimizer step 44: log_σ²=0.447947, weight=0.638938 2025-07-16 17:34:40,269 - INFO - log_σ² gradient: -0.320834 2025-07-16 17:34:40,335 - INFO - Optimizer step 45: log_σ²=0.448266, weight=0.638735 2025-07-16 17:35:01,696 - INFO - log_σ² gradient: -0.326976 2025-07-16 17:35:01,773 - INFO - Optimizer step 46: log_σ²=0.448584, weight=0.638531 2025-07-16 17:35:23,374 - INFO - log_σ² gradient: -0.322877 2025-07-16 17:35:23,451 - INFO - Optimizer step 47: log_σ²=0.448901, weight=0.638329 2025-07-16 17:35:44,559 - INFO - log_σ² gradient: -0.324069 2025-07-16 17:35:44,636 - INFO - Optimizer step 48: log_σ²=0.449216, weight=0.638128 2025-07-16 17:35:55,049 - INFO - log_σ² gradient: -0.155916 2025-07-16 17:35:55,127 - INFO - Optimizer step 49: log_σ²=0.449514, weight=0.637938 2025-07-16 17:35:55,358 - INFO - Epoch 25: Total optimizer steps: 49 2025-07-16 17:38:51,071 - INFO - Validation metrics: 2025-07-16 17:38:51,071 - INFO - Loss: 0.4202 2025-07-16 17:38:51,071 - INFO - BCE Loss: 0.3267 2025-07-16 17:38:51,071 - INFO - Weighted BCE Loss: 0.2084 2025-07-16 17:38:51,071 - INFO - Average similarity: 0.7664 2025-07-16 17:38:51,071 - INFO - Median similarity: 0.7836 2025-07-16 17:38:51,071 - INFO - Clean sample 
similarity: 0.7664 2025-07-16 17:38:51,071 - INFO - Corrupted sample similarity: 0.3733 2025-07-16 17:38:51,071 - INFO - Similarity gap (clean - corrupt): 0.3932 2025-07-16 17:38:51,257 - INFO - Epoch 25/30 - Train Loss: 0.4474, Val Loss: 0.4202, Val BCE: 0.3267, Val wBCE: 0.2084, Clean Sim: 0.7664, Corrupt Sim: 0.3733, Gap: 0.3932, Time: 1246.85s 2025-07-16 17:38:51,257 - INFO - New best validation loss: 0.4202 2025-07-16 17:38:54,679 - INFO - New best similarity gap: 0.3932 2025-07-16 17:39:29,197 - INFO - log_σ² gradient: -0.322956 2025-07-16 17:39:29,268 - INFO - Optimizer step 1: log_σ²=0.449811, weight=0.637748 2025-07-16 17:39:50,207 - INFO - log_σ² gradient: -0.325651 2025-07-16 17:39:50,280 - INFO - Optimizer step 2: log_σ²=0.450109, weight=0.637558 2025-07-16 17:40:11,927 - INFO - log_σ² gradient: -0.323758 2025-07-16 17:40:11,996 - INFO - Optimizer step 3: log_σ²=0.450407, weight=0.637368 2025-07-16 17:40:32,715 - INFO - log_σ² gradient: -0.325859 2025-07-16 17:40:32,790 - INFO - Optimizer step 4: log_σ²=0.450706, weight=0.637178 2025-07-16 17:40:53,150 - INFO - log_σ² gradient: -0.317543 2025-07-16 17:40:53,218 - INFO - Optimizer step 5: log_σ²=0.451003, weight=0.636989 2025-07-16 17:41:14,491 - INFO - log_σ² gradient: -0.324006 2025-07-16 17:41:14,564 - INFO - Optimizer step 6: log_σ²=0.451300, weight=0.636800 2025-07-16 17:41:35,815 - INFO - log_σ² gradient: -0.324719 2025-07-16 17:41:35,885 - INFO - Optimizer step 7: log_σ²=0.451597, weight=0.636611 2025-07-16 17:41:56,342 - INFO - log_σ² gradient: -0.327901 2025-07-16 17:41:56,418 - INFO - Optimizer step 8: log_σ²=0.451893, weight=0.636422 2025-07-16 17:42:18,593 - INFO - log_σ² gradient: -0.328148 2025-07-16 17:42:18,672 - INFO - Optimizer step 9: log_σ²=0.452190, weight=0.636233 2025-07-16 17:42:41,236 - INFO - log_σ² gradient: -0.323130 2025-07-16 17:42:41,310 - INFO - Optimizer step 10: log_σ²=0.452485, weight=0.636045 2025-07-16 17:43:01,667 - INFO - log_σ² gradient: -0.324681 2025-07-16 
17:43:01,734 - INFO - Optimizer step 11: log_σ²=0.452780, weight=0.635858 2025-07-16 17:43:24,907 - INFO - log_σ² gradient: -0.320047 2025-07-16 17:43:24,985 - INFO - Optimizer step 12: log_σ²=0.453074, weight=0.635671 2025-07-16 17:43:46,383 - INFO - log_σ² gradient: -0.323216 2025-07-16 17:43:46,453 - INFO - Optimizer step 13: log_σ²=0.453367, weight=0.635485 2025-07-16 17:44:08,043 - INFO - log_σ² gradient: -0.322960 2025-07-16 17:44:08,114 - INFO - Optimizer step 14: log_σ²=0.453659, weight=0.635299 2025-07-16 17:44:28,573 - INFO - log_σ² gradient: -0.326462 2025-07-16 17:44:28,643 - INFO - Optimizer step 15: log_σ²=0.453951, weight=0.635114 2025-07-16 17:44:49,864 - INFO - log_σ² gradient: -0.322961 2025-07-16 17:44:49,937 - INFO - Optimizer step 16: log_σ²=0.454241, weight=0.634929 2025-07-16 17:45:12,183 - INFO - log_σ² gradient: -0.320800 2025-07-16 17:45:12,256 - INFO - Optimizer step 17: log_σ²=0.454531, weight=0.634746 2025-07-16 17:45:33,466 - INFO - log_σ² gradient: -0.325773 2025-07-16 17:45:33,534 - INFO - Optimizer step 18: log_σ²=0.454819, weight=0.634563 2025-07-16 17:45:53,893 - INFO - log_σ² gradient: -0.326931 2025-07-16 17:45:53,961 - INFO - Optimizer step 19: log_σ²=0.455107, weight=0.634380 2025-07-16 17:46:16,401 - INFO - log_σ² gradient: -0.322485 2025-07-16 17:46:16,479 - INFO - Optimizer step 20: log_σ²=0.455394, weight=0.634198 2025-07-16 17:46:37,844 - INFO - log_σ² gradient: -0.321992 2025-07-16 17:46:37,913 - INFO - Optimizer step 21: log_σ²=0.455679, weight=0.634017 2025-07-16 17:46:58,959 - INFO - log_σ² gradient: -0.321438 2025-07-16 17:46:59,037 - INFO - Optimizer step 22: log_σ²=0.455963, weight=0.633837 2025-07-16 17:47:20,752 - INFO - log_σ² gradient: -0.325095 2025-07-16 17:47:20,830 - INFO - Optimizer step 23: log_σ²=0.456247, weight=0.633658 2025-07-16 17:47:42,190 - INFO - log_σ² gradient: -0.327689 2025-07-16 17:47:42,259 - INFO - Optimizer step 24: log_σ²=0.456529, weight=0.633478 2025-07-16 17:48:03,620 - INFO - log_σ² 
gradient: -0.325889 2025-07-16 17:48:03,695 - INFO - Optimizer step 25: log_σ²=0.456811, weight=0.633300 2025-07-16 17:48:25,186 - INFO - log_σ² gradient: -0.326827 2025-07-16 17:48:25,255 - INFO - Optimizer step 26: log_σ²=0.457092, weight=0.633122 2025-07-16 17:48:47,589 - INFO - log_σ² gradient: -0.322323 2025-07-16 17:48:47,659 - INFO - Optimizer step 27: log_σ²=0.457371, weight=0.632945 2025-07-16 17:49:09,362 - INFO - log_σ² gradient: -0.324809 2025-07-16 17:49:09,435 - INFO - Optimizer step 28: log_σ²=0.457650, weight=0.632769 2025-07-16 17:49:31,895 - INFO - log_σ² gradient: -0.317985 2025-07-16 17:49:31,966 - INFO - Optimizer step 29: log_σ²=0.457926, weight=0.632594 2025-07-16 17:49:53,288 - INFO - log_σ² gradient: -0.320271 2025-07-16 17:49:53,362 - INFO - Optimizer step 30: log_σ²=0.458202, weight=0.632420 2025-07-16 17:50:14,923 - INFO - log_σ² gradient: -0.319252 2025-07-16 17:50:14,993 - INFO - Optimizer step 31: log_σ²=0.458475, weight=0.632247 2025-07-16 17:50:37,007 - INFO - log_σ² gradient: -0.319040 2025-07-16 17:50:37,086 - INFO - Optimizer step 32: log_σ²=0.458748, weight=0.632075 2025-07-16 17:50:59,504 - INFO - log_σ² gradient: -0.319906 2025-07-16 17:50:59,577 - INFO - Optimizer step 33: log_σ²=0.459019, weight=0.631903 2025-07-16 17:51:21,898 - INFO - log_σ² gradient: -0.322770 2025-07-16 17:51:21,969 - INFO - Optimizer step 34: log_σ²=0.459288, weight=0.631733 2025-07-16 17:51:43,429 - INFO - log_σ² gradient: -0.323554 2025-07-16 17:51:43,502 - INFO - Optimizer step 35: log_σ²=0.459557, weight=0.631563 2025-07-16 17:52:03,780 - INFO - log_σ² gradient: -0.324521 2025-07-16 17:52:03,850 - INFO - Optimizer step 36: log_σ²=0.459825, weight=0.631394 2025-07-16 17:52:25,612 - INFO - log_σ² gradient: -0.323864 2025-07-16 17:52:25,682 - INFO - Optimizer step 37: log_σ²=0.460092, weight=0.631226 2025-07-16 17:52:47,488 - INFO - log_σ² gradient: -0.323620 2025-07-16 17:52:47,556 - INFO - Optimizer step 38: log_σ²=0.460357, weight=0.631058 
2025-07-16 17:53:08,001 - INFO - log_σ² gradient: -0.320549 2025-07-16 17:53:08,075 - INFO - Optimizer step 39: log_σ²=0.460622, weight=0.630891 2025-07-16 17:53:29,649 - INFO - log_σ² gradient: -0.322247 2025-07-16 17:53:29,724 - INFO - Optimizer step 40: log_σ²=0.460885, weight=0.630725 2025-07-16 17:53:51,087 - INFO - log_σ² gradient: -0.322427 2025-07-16 17:53:51,164 - INFO - Optimizer step 41: log_σ²=0.461147, weight=0.630560 2025-07-16 17:54:13,856 - INFO - log_σ² gradient: -0.331722 2025-07-16 17:54:13,932 - INFO - Optimizer step 42: log_σ²=0.461408, weight=0.630395 2025-07-16 17:54:34,821 - INFO - log_σ² gradient: -0.328387 2025-07-16 17:54:34,891 - INFO - Optimizer step 43: log_σ²=0.461669, weight=0.630231 2025-07-16 17:54:56,583 - INFO - log_σ² gradient: -0.329363 2025-07-16 17:54:56,658 - INFO - Optimizer step 44: log_σ²=0.461928, weight=0.630067 2025-07-16 17:55:17,987 - INFO - log_σ² gradient: -0.331455 2025-07-16 17:55:18,057 - INFO - Optimizer step 45: log_σ²=0.462188, weight=0.629904 2025-07-16 17:55:39,017 - INFO - log_σ² gradient: -0.326687 2025-07-16 17:55:39,083 - INFO - Optimizer step 46: log_σ²=0.462446, weight=0.629742 2025-07-16 17:56:00,697 - INFO - log_σ² gradient: -0.316881 2025-07-16 17:56:00,768 - INFO - Optimizer step 47: log_σ²=0.462702, weight=0.629580 2025-07-16 17:56:22,203 - INFO - log_σ² gradient: -0.314152 2025-07-16 17:56:22,282 - INFO - Optimizer step 48: log_σ²=0.462956, weight=0.629420 2025-07-16 17:56:32,691 - INFO - log_σ² gradient: -0.157714 2025-07-16 17:56:32,764 - INFO - Optimizer step 49: log_σ²=0.463196, weight=0.629269 2025-07-16 17:56:32,980 - INFO - Epoch 26: Total optimizer steps: 49 2025-07-16 17:59:29,595 - INFO - Validation metrics: 2025-07-16 17:59:29,596 - INFO - Loss: 0.4191 2025-07-16 17:59:29,596 - INFO - BCE Loss: 0.3272 2025-07-16 17:59:29,596 - INFO - Weighted BCE Loss: 0.2059 2025-07-16 17:59:29,596 - INFO - Average similarity: 0.7106 2025-07-16 17:59:29,596 - INFO - Median similarity: 0.7228 
2025-07-16 17:59:29,596 - INFO - Clean sample similarity: 0.7106 2025-07-16 17:59:29,596 - INFO - Corrupted sample similarity: 0.3427 2025-07-16 17:59:29,596 - INFO - Similarity gap (clean - corrupt): 0.3679 2025-07-16 17:59:29,741 - INFO - Epoch 26/30 - Train Loss: 0.4434, Val Loss: 0.4191, Val BCE: 0.3272, Val wBCE: 0.2059, Clean Sim: 0.7106, Corrupt Sim: 0.3427, Gap: 0.3679, Time: 1231.67s 2025-07-16 17:59:29,741 - INFO - New best validation loss: 0.4191 2025-07-16 18:02:16,026 - INFO - Epoch 26 Validation Alignment: Pos=0.236, Neg=0.129, Gap=0.106 2025-07-16 18:02:48,189 - INFO - log_σ² gradient: -0.322103 2025-07-16 18:02:48,267 - INFO - Optimizer step 1: log_σ²=0.463436, weight=0.629118 2025-07-16 18:03:08,283 - INFO - log_σ² gradient: -0.325900 2025-07-16 18:03:08,354 - INFO - Optimizer step 2: log_σ²=0.463676, weight=0.628967 2025-07-16 18:03:29,756 - INFO - log_σ² gradient: -0.321206 2025-07-16 18:03:29,831 - INFO - Optimizer step 3: log_σ²=0.463916, weight=0.628816 2025-07-16 18:03:50,474 - INFO - log_σ² gradient: -0.320881 2025-07-16 18:03:50,553 - INFO - Optimizer step 4: log_σ²=0.464156, weight=0.628666 2025-07-16 18:04:11,989 - INFO - log_σ² gradient: -0.325678 2025-07-16 18:04:12,061 - INFO - Optimizer step 5: log_σ²=0.464395, weight=0.628515 2025-07-16 18:04:34,249 - INFO - log_σ² gradient: -0.321775 2025-07-16 18:04:34,328 - INFO - Optimizer step 6: log_σ²=0.464634, weight=0.628365 2025-07-16 18:04:55,638 - INFO - log_σ² gradient: -0.315580 2025-07-16 18:04:55,716 - INFO - Optimizer step 7: log_σ²=0.464872, weight=0.628215 2025-07-16 18:05:17,313 - INFO - log_σ² gradient: -0.321151 2025-07-16 18:05:17,393 - INFO - Optimizer step 8: log_σ²=0.465109, weight=0.628067 2025-07-16 18:05:38,985 - INFO - log_σ² gradient: -0.321865 2025-07-16 18:05:39,063 - INFO - Optimizer step 9: log_σ²=0.465346, weight=0.627918 2025-07-16 18:06:00,315 - INFO - log_σ² gradient: -0.326785 2025-07-16 18:06:00,389 - INFO - Optimizer step 10: log_σ²=0.465582, weight=0.627770 
2025-07-16 18:06:22,251 - INFO - log_σ² gradient: -0.317410 2025-07-16 18:06:22,323 - INFO - Optimizer step 11: log_σ²=0.465817, weight=0.627622 2025-07-16 18:06:44,188 - INFO - log_σ² gradient: -0.325314 2025-07-16 18:06:44,262 - INFO - Optimizer step 12: log_σ²=0.466051, weight=0.627475 2025-07-16 18:07:06,005 - INFO - log_σ² gradient: -0.324260 2025-07-16 18:07:06,081 - INFO - Optimizer step 13: log_σ²=0.466285, weight=0.627329 2025-07-16 18:07:27,665 - INFO - log_σ² gradient: -0.321646 2025-07-16 18:07:27,735 - INFO - Optimizer step 14: log_σ²=0.466517, weight=0.627183 2025-07-16 18:07:49,501 - INFO - log_σ² gradient: -0.319124 2025-07-16 18:07:49,567 - INFO - Optimizer step 15: log_σ²=0.466749, weight=0.627038 2025-07-16 18:08:10,074 - INFO - log_σ² gradient: -0.318753 2025-07-16 18:08:10,144 - INFO - Optimizer step 16: log_σ²=0.466979, weight=0.626893 2025-07-16 18:08:31,827 - INFO - log_σ² gradient: -0.323607 2025-07-16 18:08:31,898 - INFO - Optimizer step 17: log_σ²=0.467208, weight=0.626749 2025-07-16 18:08:52,282 - INFO - log_σ² gradient: -0.322269 2025-07-16 18:08:52,354 - INFO - Optimizer step 18: log_σ²=0.467437, weight=0.626606 2025-07-16 18:09:13,848 - INFO - log_σ² gradient: -0.318548 2025-07-16 18:09:13,920 - INFO - Optimizer step 19: log_σ²=0.467664, weight=0.626464 2025-07-16 18:09:35,415 - INFO - log_σ² gradient: -0.318708 2025-07-16 18:09:35,493 - INFO - Optimizer step 20: log_σ²=0.467890, weight=0.626323 2025-07-16 18:09:57,104 - INFO - log_σ² gradient: -0.322355 2025-07-16 18:09:57,171 - INFO - Optimizer step 21: log_σ²=0.468115, weight=0.626182 2025-07-16 18:10:19,017 - INFO - log_σ² gradient: -0.327664 2025-07-16 18:10:19,087 - INFO - Optimizer step 22: log_σ²=0.468339, weight=0.626041 2025-07-16 18:10:40,802 - INFO - log_σ² gradient: -0.313838 2025-07-16 18:10:40,871 - INFO - Optimizer step 23: log_σ²=0.468561, weight=0.625902 2025-07-16 18:11:03,152 - INFO - log_σ² gradient: -0.322479 2025-07-16 18:11:03,230 - INFO - Optimizer step 24: 
log_σ²=0.468783, weight=0.625763 2025-07-16 18:11:24,047 - INFO - log_σ² gradient: -0.316410 2025-07-16 18:11:24,116 - INFO - Optimizer step 25: log_σ²=0.469003, weight=0.625626 2025-07-16 18:11:45,563 - INFO - log_σ² gradient: -0.320667 2025-07-16 18:11:45,634 - INFO - Optimizer step 26: log_σ²=0.469222, weight=0.625489 2025-07-16 18:12:06,537 - INFO - log_σ² gradient: -0.325897 2025-07-16 18:12:06,611 - INFO - Optimizer step 27: log_σ²=0.469440, weight=0.625352 2025-07-16 18:12:28,693 - INFO - log_σ² gradient: -0.322635 2025-07-16 18:12:28,770 - INFO - Optimizer step 28: log_σ²=0.469657, weight=0.625217 2025-07-16 18:12:49,795 - INFO - log_σ² gradient: -0.316460 2025-07-16 18:12:49,873 - INFO - Optimizer step 29: log_σ²=0.469872, weight=0.625082 2025-07-16 18:13:11,207 - INFO - log_σ² gradient: -0.323870 2025-07-16 18:13:11,282 - INFO - Optimizer step 30: log_σ²=0.470087, weight=0.624948 2025-07-16 18:13:33,043 - INFO - log_σ² gradient: -0.318264 2025-07-16 18:13:33,117 - INFO - Optimizer step 31: log_σ²=0.470300, weight=0.624815 2025-07-16 18:13:54,685 - INFO - log_σ² gradient: -0.320969 2025-07-16 18:13:54,757 - INFO - Optimizer step 32: log_σ²=0.470512, weight=0.624682 2025-07-16 18:14:17,910 - INFO - log_σ² gradient: -0.313777 2025-07-16 18:14:17,988 - INFO - Optimizer step 33: log_σ²=0.470722, weight=0.624551 2025-07-16 18:14:40,294 - INFO - log_σ² gradient: -0.318975 2025-07-16 18:14:40,367 - INFO - Optimizer step 34: log_σ²=0.470931, weight=0.624420 2025-07-16 18:15:01,897 - INFO - log_σ² gradient: -0.316873 2025-07-16 18:15:01,969 - INFO - Optimizer step 35: log_σ²=0.471139, weight=0.624291 2025-07-16 18:15:23,578 - INFO - log_σ² gradient: -0.317007 2025-07-16 18:15:23,646 - INFO - Optimizer step 36: log_σ²=0.471345, weight=0.624162 2025-07-16 18:15:46,174 - INFO - log_σ² gradient: -0.313413 2025-07-16 18:15:46,248 - INFO - Optimizer step 37: log_σ²=0.471550, weight=0.624034 2025-07-16 18:16:08,944 - INFO - log_σ² gradient: -0.317898 2025-07-16 
18:16:09,017 - INFO - Optimizer step 38: log_σ²=0.471753, weight=0.623907 2025-07-16 18:16:29,768 - INFO - log_σ² gradient: -0.329691 2025-07-16 18:16:29,846 - INFO - Optimizer step 39: log_σ²=0.471956, weight=0.623781 2025-07-16 18:16:50,501 - INFO - log_σ² gradient: -0.317251 2025-07-16 18:16:50,571 - INFO - Optimizer step 40: log_σ²=0.472158, weight=0.623655 2025-07-16 18:17:11,585 - INFO - log_σ² gradient: -0.315997 2025-07-16 18:17:11,653 - INFO - Optimizer step 41: log_σ²=0.472358, weight=0.623530 2025-07-16 18:17:33,301 - INFO - log_σ² gradient: -0.321237 2025-07-16 18:17:33,377 - INFO - Optimizer step 42: log_σ²=0.472557, weight=0.623406 2025-07-16 18:17:54,981 - INFO - log_σ² gradient: -0.314211 2025-07-16 18:17:55,054 - INFO - Optimizer step 43: log_σ²=0.472754, weight=0.623283 2025-07-16 18:18:16,267 - INFO - log_σ² gradient: -0.323492 2025-07-16 18:18:16,341 - INFO - Optimizer step 44: log_σ²=0.472951, weight=0.623161 2025-07-16 18:18:40,094 - INFO - log_σ² gradient: -0.313404 2025-07-16 18:18:40,177 - INFO - Optimizer step 45: log_σ²=0.473146, weight=0.623039 2025-07-16 18:19:00,727 - INFO - log_σ² gradient: -0.315326 2025-07-16 18:19:00,796 - INFO - Optimizer step 46: log_σ²=0.473339, weight=0.622919 2025-07-16 18:19:22,485 - INFO - log_σ² gradient: -0.316575 2025-07-16 18:19:22,560 - INFO - Optimizer step 47: log_σ²=0.473532, weight=0.622799 2025-07-16 18:19:44,926 - INFO - log_σ² gradient: -0.312694 2025-07-16 18:19:44,999 - INFO - Optimizer step 48: log_σ²=0.473722, weight=0.622680 2025-07-16 18:19:55,641 - INFO - log_σ² gradient: -0.149482 2025-07-16 18:19:55,708 - INFO - Optimizer step 49: log_σ²=0.473901, weight=0.622569 2025-07-16 18:19:55,972 - INFO - Epoch 27: Total optimizer steps: 49 2025-07-16 18:22:55,794 - INFO - Validation metrics: 2025-07-16 18:22:55,794 - INFO - Loss: 0.4048 2025-07-16 18:22:55,794 - INFO - BCE Loss: 0.3183 2025-07-16 18:22:55,795 - INFO - Weighted BCE Loss: 0.1982 2025-07-16 18:22:55,795 - INFO - Average similarity: 
0.7394 2025-07-16 18:22:55,795 - INFO - Median similarity: 0.7573 2025-07-16 18:22:55,795 - INFO - Clean sample similarity: 0.7394 2025-07-16 18:22:55,795 - INFO - Corrupted sample similarity: 0.3443 2025-07-16 18:22:55,795 - INFO - Similarity gap (clean - corrupt): 0.3951 2025-07-16 18:22:56,009 - INFO - Epoch 27/30 - Train Loss: 0.4368, Val Loss: 0.4048, Val BCE: 0.3183, Val wBCE: 0.1982, Clean Sim: 0.7394, Corrupt Sim: 0.3443, Gap: 0.3951, Time: 1239.98s 2025-07-16 18:22:56,010 - INFO - New best validation loss: 0.4048 2025-07-16 18:22:59,438 - INFO - New best similarity gap: 0.3951 2025-07-16 18:23:33,828 - INFO - log_σ² gradient: -0.318314 2025-07-16 18:23:33,901 - INFO - Optimizer step 1: log_σ²=0.474081, weight=0.622457 2025-07-16 18:23:56,548 - INFO - log_σ² gradient: -0.320577 2025-07-16 18:23:56,618 - INFO - Optimizer step 2: log_σ²=0.474260, weight=0.622346 2025-07-16 18:24:17,295 - INFO - log_σ² gradient: -0.322590 2025-07-16 18:24:17,367 - INFO - Optimizer step 3: log_σ²=0.474439, weight=0.622234 2025-07-16 18:24:40,417 - INFO - log_σ² gradient: -0.317877 2025-07-16 18:24:40,491 - INFO - Optimizer step 4: log_σ²=0.474617, weight=0.622123 2025-07-16 18:25:01,079 - INFO - log_σ² gradient: -0.312603 2025-07-16 18:25:01,157 - INFO - Optimizer step 5: log_σ²=0.474795, weight=0.622012 2025-07-16 18:25:24,118 - INFO - log_σ² gradient: -0.314375 2025-07-16 18:25:24,188 - INFO - Optimizer step 6: log_σ²=0.474972, weight=0.621903 2025-07-16 18:25:45,058 - INFO - log_σ² gradient: -0.314455 2025-07-16 18:25:45,135 - INFO - Optimizer step 7: log_σ²=0.475148, weight=0.621793 2025-07-16 18:26:08,754 - INFO - log_σ² gradient: -0.313424 2025-07-16 18:26:08,840 - INFO - Optimizer step 8: log_σ²=0.475323, weight=0.621684 2025-07-16 18:26:31,086 - INFO - log_σ² gradient: -0.315986 2025-07-16 18:26:31,163 - INFO - Optimizer step 9: log_σ²=0.475497, weight=0.621576 2025-07-16 18:26:55,032 - INFO - log_σ² gradient: -0.313598 2025-07-16 18:26:55,110 - INFO - Optimizer step 
10: log_σ²=0.475670, weight=0.621468 2025-07-16 18:27:17,965 - INFO - log_σ² gradient: -0.323455 2025-07-16 18:27:18,037 - INFO - Optimizer step 11: log_σ²=0.475843, weight=0.621361 2025-07-16 18:27:40,370 - INFO - log_σ² gradient: -0.312659 2025-07-16 18:27:40,452 - INFO - Optimizer step 12: log_σ²=0.476015, weight=0.621254 2025-07-16 18:28:03,054 - INFO - log_σ² gradient: -0.313841 2025-07-16 18:28:03,129 - INFO - Optimizer step 13: log_σ²=0.476185, weight=0.621149 2025-07-16 18:28:26,481 - INFO - log_σ² gradient: -0.313081 2025-07-16 18:28:26,565 - INFO - Optimizer step 14: log_σ²=0.476354, weight=0.621043 2025-07-16 18:28:50,092 - INFO - log_σ² gradient: -0.317609 2025-07-16 18:28:50,166 - INFO - Optimizer step 15: log_σ²=0.476523, weight=0.620939 2025-07-16 18:29:13,402 - INFO - log_σ² gradient: -0.317853 2025-07-16 18:29:13,478 - INFO - Optimizer step 16: log_σ²=0.476690, weight=0.620835 2025-07-16 18:29:35,184 - INFO - log_σ² gradient: -0.311518 2025-07-16 18:29:35,265 - INFO - Optimizer step 17: log_σ²=0.476856, weight=0.620732 2025-07-16 18:29:58,038 - INFO - log_σ² gradient: -0.318774 2025-07-16 18:29:58,113 - INFO - Optimizer step 18: log_σ²=0.477021, weight=0.620629 2025-07-16 18:30:20,622 - INFO - log_σ² gradient: -0.313498 2025-07-16 18:30:20,695 - INFO - Optimizer step 19: log_σ²=0.477185, weight=0.620527 2025-07-16 18:30:43,581 - INFO - log_σ² gradient: -0.321497 2025-07-16 18:30:43,659 - INFO - Optimizer step 20: log_σ²=0.477349, weight=0.620426 2025-07-16 18:31:05,447 - INFO - log_σ² gradient: -0.318566 2025-07-16 18:31:05,520 - INFO - Optimizer step 21: log_σ²=0.477511, weight=0.620326 2025-07-16 18:31:28,264 - INFO - log_σ² gradient: -0.323293 2025-07-16 18:31:28,339 - INFO - Optimizer step 22: log_σ²=0.477672, weight=0.620226 2025-07-16 18:31:50,546 - INFO - log_σ² gradient: -0.318012 2025-07-16 18:31:50,614 - INFO - Optimizer step 23: log_σ²=0.477832, weight=0.620126 2025-07-16 18:32:11,604 - INFO - log_σ² gradient: -0.311018 2025-07-16 
18:32:11,674 - INFO - Optimizer step 24: log_σ²=0.477991, weight=0.620028 2025-07-16 18:32:34,135 - INFO - log_σ² gradient: -0.316894 2025-07-16 18:32:34,203 - INFO - Optimizer step 25: log_σ²=0.478149, weight=0.619930 2025-07-16 18:32:54,916 - INFO - log_σ² gradient: -0.313898 2025-07-16 18:32:54,986 - INFO - Optimizer step 26: log_σ²=0.478305, weight=0.619833 2025-07-16 18:33:16,531 - INFO - log_σ² gradient: -0.312950 2025-07-16 18:33:16,604 - INFO - Optimizer step 27: log_σ²=0.478460, weight=0.619737 2025-07-16 18:33:39,673 - INFO - log_σ² gradient: -0.314717 2025-07-16 18:33:39,743 - INFO - Optimizer step 28: log_σ²=0.478613, weight=0.619642 2025-07-16 18:34:02,143 - INFO - log_σ² gradient: -0.319450 2025-07-16 18:34:02,217 - INFO - Optimizer step 29: log_σ²=0.478766, weight=0.619547 2025-07-16 18:34:25,788 - INFO - log_σ² gradient: -0.314998 2025-07-16 18:34:25,862 - INFO - Optimizer step 30: log_σ²=0.478917, weight=0.619454 2025-07-16 18:34:47,530 - INFO - log_σ² gradient: -0.315387 2025-07-16 18:34:47,601 - INFO - Optimizer step 31: log_σ²=0.479067, weight=0.619361 2025-07-16 18:35:09,466 - INFO - log_σ² gradient: -0.312432 2025-07-16 18:35:09,539 - INFO - Optimizer step 32: log_σ²=0.479216, weight=0.619269 2025-07-16 18:35:30,703 - INFO - log_σ² gradient: -0.317507 2025-07-16 18:35:30,773 - INFO - Optimizer step 33: log_σ²=0.479364, weight=0.619177 2025-07-16 18:35:53,094 - INFO - log_σ² gradient: -0.310516 2025-07-16 18:35:53,164 - INFO - Optimizer step 34: log_σ²=0.479510, weight=0.619087 2025-07-16 18:36:15,184 - INFO - log_σ² gradient: -0.318876 2025-07-16 18:36:15,256 - INFO - Optimizer step 35: log_σ²=0.479655, weight=0.618997 2025-07-16 18:36:37,463 - INFO - log_σ² gradient: -0.316022 2025-07-16 18:36:37,533 - INFO - Optimizer step 36: log_σ²=0.479799, weight=0.618908 2025-07-16 18:37:00,338 - INFO - log_σ² gradient: -0.319270 2025-07-16 18:37:00,411 - INFO - Optimizer step 37: log_σ²=0.479941, weight=0.618820 2025-07-16 18:37:21,672 - INFO - log_σ² 
gradient: -0.316828 2025-07-16 18:37:21,743 - INFO - Optimizer step 38: log_σ²=0.480083, weight=0.618732 2025-07-16 18:37:43,290 - INFO - log_σ² gradient: -0.311760 2025-07-16 18:37:43,369 - INFO - Optimizer step 39: log_σ²=0.480223, weight=0.618645 2025-07-16 18:38:05,304 - INFO - log_σ² gradient: -0.312900 2025-07-16 18:38:05,379 - INFO - Optimizer step 40: log_σ²=0.480362, weight=0.618559 2025-07-16 18:38:28,393 - INFO - log_σ² gradient: -0.309953 2025-07-16 18:38:28,466 - INFO - Optimizer step 41: log_σ²=0.480499, weight=0.618474 2025-07-16 18:38:50,115 - INFO - log_σ² gradient: -0.311825 2025-07-16 18:38:50,190 - INFO - Optimizer step 42: log_σ²=0.480635, weight=0.618390 2025-07-16 18:39:13,846 - INFO - log_σ² gradient: -0.316002 2025-07-16 18:39:13,924 - INFO - Optimizer step 43: log_σ²=0.480770, weight=0.618307 2025-07-16 18:39:35,541 - INFO - log_σ² gradient: -0.319058 2025-07-16 18:39:35,615 - INFO - Optimizer step 44: log_σ²=0.480904, weight=0.618224 2025-07-16 18:39:57,474 - INFO - log_σ² gradient: -0.315300 2025-07-16 18:39:57,544 - INFO - Optimizer step 45: log_σ²=0.481036, weight=0.618143 2025-07-16 18:40:20,132 - INFO - log_σ² gradient: -0.314594 2025-07-16 18:40:20,203 - INFO - Optimizer step 46: log_σ²=0.481167, weight=0.618061 2025-07-16 18:40:42,052 - INFO - log_σ² gradient: -0.318473 2025-07-16 18:40:42,126 - INFO - Optimizer step 47: log_σ²=0.481298, weight=0.617981 2025-07-16 18:41:04,292 - INFO - log_σ² gradient: -0.313706 2025-07-16 18:41:04,364 - INFO - Optimizer step 48: log_σ²=0.481426, weight=0.617901 2025-07-16 18:41:13,801 - INFO - log_σ² gradient: -0.146704 2025-07-16 18:41:13,872 - INFO - Optimizer step 49: log_σ²=0.481547, weight=0.617827 2025-07-16 18:41:14,048 - INFO - Epoch 28: Total optimizer steps: 49 2025-07-16 18:44:12,903 - INFO - Validation metrics: 2025-07-16 18:44:12,903 - INFO - Loss: 0.4100 2025-07-16 18:44:12,903 - INFO - BCE Loss: 0.3171 2025-07-16 18:44:12,903 - INFO - Weighted BCE Loss: 0.1959 2025-07-16 
18:44:12,903 - INFO - Average similarity: 0.7722 2025-07-16 18:44:12,903 - INFO - Median similarity: 0.7961 2025-07-16 18:44:12,903 - INFO - Clean sample similarity: 0.7722 2025-07-16 18:44:12,903 - INFO - Corrupted sample similarity: 0.3613 2025-07-16 18:44:12,903 - INFO - Similarity gap (clean - corrupt): 0.4109 2025-07-16 18:44:13,128 - INFO - Epoch 28/30 - Train Loss: 0.4317, Val Loss: 0.4100, Val BCE: 0.3171, Val wBCE: 0.1959, Clean Sim: 0.7722, Corrupt Sim: 0.3613, Gap: 0.4109, Time: 1270.32s 2025-07-16 18:44:13,128 - INFO - New best similarity gap: 0.4109 2025-07-16 18:47:06,349 - INFO - Epoch 28 Validation Alignment: Pos=0.228, Neg=0.098, Gap=0.130 2025-07-16 18:47:43,099 - INFO - log_σ² gradient: -0.321795 2025-07-16 18:47:43,173 - INFO - Optimizer step 1: log_σ²=0.481667, weight=0.617752 2025-07-16 18:48:05,753 - INFO - log_σ² gradient: -0.320469 2025-07-16 18:48:05,827 - INFO - Optimizer step 2: log_σ²=0.481788, weight=0.617678 2025-07-16 18:48:28,891 - INFO - log_σ² gradient: -0.317235 2025-07-16 18:48:28,965 - INFO - Optimizer step 3: log_σ²=0.481907, weight=0.617604 2025-07-16 18:48:51,789 - INFO - log_σ² gradient: -0.317188 2025-07-16 18:48:51,863 - INFO - Optimizer step 4: log_σ²=0.482026, weight=0.617531 2025-07-16 18:49:13,984 - INFO - log_σ² gradient: -0.313716 2025-07-16 18:49:14,062 - INFO - Optimizer step 5: log_σ²=0.482144, weight=0.617458 2025-07-16 18:49:36,917 - INFO - log_σ² gradient: -0.315528 2025-07-16 18:49:36,987 - INFO - Optimizer step 6: log_σ²=0.482261, weight=0.617386 2025-07-16 18:49:58,109 - INFO - log_σ² gradient: -0.311781 2025-07-16 18:49:58,183 - INFO - Optimizer step 7: log_σ²=0.482376, weight=0.617315 2025-07-16 18:50:22,140 - INFO - log_σ² gradient: -0.310931 2025-07-16 18:50:22,215 - INFO - Optimizer step 8: log_σ²=0.482491, weight=0.617244 2025-07-16 18:50:44,009 - INFO - log_σ² gradient: -0.320565 2025-07-16 18:50:44,081 - INFO - Optimizer step 9: log_σ²=0.482605, weight=0.617173 2025-07-16 18:51:08,024 - INFO - 
log_σ² gradient: -0.308661 2025-07-16 18:51:08,102 - INFO - Optimizer step 10: log_σ²=0.482718, weight=0.617104 2025-07-16 18:51:30,315 - INFO - log_σ² gradient: -0.310485 2025-07-16 18:51:30,386 - INFO - Optimizer step 11: log_σ²=0.482830, weight=0.617035 2025-07-16 18:51:52,034 - INFO - log_σ² gradient: -0.311352 2025-07-16 18:51:52,105 - INFO - Optimizer step 12: log_σ²=0.482940, weight=0.616967 2025-07-16 18:52:14,218 - INFO - log_σ² gradient: -0.317498 2025-07-16 18:52:14,292 - INFO - Optimizer step 13: log_σ²=0.483050, weight=0.616899 2025-07-16 18:52:37,161 - INFO - log_σ² gradient: -0.316899 2025-07-16 18:52:37,231 - INFO - Optimizer step 14: log_σ²=0.483158, weight=0.616832 2025-07-16 18:53:00,102 - INFO - log_σ² gradient: -0.315905 2025-07-16 18:53:00,175 - INFO - Optimizer step 15: log_σ²=0.483266, weight=0.616766 2025-07-16 18:53:24,263 - INFO - log_σ² gradient: -0.315235 2025-07-16 18:53:24,340 - INFO - Optimizer step 16: log_σ²=0.483372, weight=0.616700 2025-07-16 18:53:47,426 - INFO - log_σ² gradient: -0.318999 2025-07-16 18:53:47,499 - INFO - Optimizer step 17: log_σ²=0.483477, weight=0.616636 2025-07-16 18:54:10,408 - INFO - log_σ² gradient: -0.311687 2025-07-16 18:54:10,483 - INFO - Optimizer step 18: log_σ²=0.483581, weight=0.616571 2025-07-16 18:54:33,789 - INFO - log_σ² gradient: -0.318782 2025-07-16 18:54:33,859 - INFO - Optimizer step 19: log_σ²=0.483684, weight=0.616508 2025-07-16 18:54:56,257 - INFO - log_σ² gradient: -0.312355 2025-07-16 18:54:56,330 - INFO - Optimizer step 20: log_σ²=0.483786, weight=0.616445 2025-07-16 18:55:19,166 - INFO - log_σ² gradient: -0.311967 2025-07-16 18:55:19,239 - INFO - Optimizer step 21: log_σ²=0.483886, weight=0.616383 2025-07-16 18:55:42,310 - INFO - log_σ² gradient: -0.314695 2025-07-16 18:55:42,383 - INFO - Optimizer step 22: log_σ²=0.483985, weight=0.616322 2025-07-16 18:56:05,156 - INFO - log_σ² gradient: -0.312973 2025-07-16 18:56:05,238 - INFO - Optimizer step 23: log_σ²=0.484083, weight=0.616262 
2025-07-16 18:56:28,661 - INFO - log_σ² gradient: -0.316610 2025-07-16 18:56:28,731 - INFO - Optimizer step 24: log_σ²=0.484180, weight=0.616202 2025-07-16 18:56:51,207 - INFO - log_σ² gradient: -0.317070 2025-07-16 18:56:51,275 - INFO - Optimizer step 25: log_σ²=0.484276, weight=0.616143 2025-07-16 18:57:14,139 - INFO - log_σ² gradient: -0.321108 2025-07-16 18:57:14,209 - INFO - Optimizer step 26: log_σ²=0.484370, weight=0.616085 2025-07-16 18:57:37,735 - INFO - log_σ² gradient: -0.309868 2025-07-16 18:57:37,806 - INFO - Optimizer step 27: log_σ²=0.484463, weight=0.616028 2025-07-16 18:58:00,422 - INFO - log_σ² gradient: -0.318406 2025-07-16 18:58:00,492 - INFO - Optimizer step 28: log_σ²=0.484555, weight=0.615971 2025-07-16 18:58:22,652 - INFO - log_σ² gradient: -0.308848 2025-07-16 18:58:22,725 - INFO - Optimizer step 29: log_σ²=0.484646, weight=0.615915 2025-07-16 18:58:44,890 - INFO - log_σ² gradient: -0.315768 2025-07-16 18:58:44,960 - INFO - Optimizer step 30: log_σ²=0.484735, weight=0.615860 2025-07-16 18:59:08,600 - INFO - log_σ² gradient: -0.315963 2025-07-16 18:59:08,675 - INFO - Optimizer step 31: log_σ²=0.484823, weight=0.615806 2025-07-16 18:59:29,803 - INFO - log_σ² gradient: -0.315158 2025-07-16 18:59:29,877 - INFO - Optimizer step 32: log_σ²=0.484910, weight=0.615752 2025-07-16 18:59:51,864 - INFO - log_σ² gradient: -0.313606 2025-07-16 18:59:51,935 - INFO - Optimizer step 33: log_σ²=0.484996, weight=0.615700 2025-07-16 19:00:14,835 - INFO - log_σ² gradient: -0.312344 2025-07-16 19:00:14,913 - INFO - Optimizer step 34: log_σ²=0.485080, weight=0.615648 2025-07-16 19:00:37,856 - INFO - log_σ² gradient: -0.316963 2025-07-16 19:00:37,933 - INFO - Optimizer step 35: log_σ²=0.485163, weight=0.615597 2025-07-16 19:01:00,731 - INFO - log_σ² gradient: -0.316720 2025-07-16 19:01:00,809 - INFO - Optimizer step 36: log_σ²=0.485245, weight=0.615546 2025-07-16 19:01:22,868 - INFO - log_σ² gradient: -0.313797 2025-07-16 19:01:22,938 - INFO - Optimizer step 37: 
log_σ²=0.485326, weight=0.615497 2025-07-16 19:01:44,646 - INFO - log_σ² gradient: -0.312529 2025-07-16 19:01:44,717 - INFO - Optimizer step 38: log_σ²=0.485405, weight=0.615448 2025-07-16 19:02:06,435 - INFO - log_σ² gradient: -0.312155 2025-07-16 19:02:06,505 - INFO - Optimizer step 39: log_σ²=0.485483, weight=0.615400 2025-07-16 19:02:29,152 - INFO - log_σ² gradient: -0.311386 2025-07-16 19:02:29,228 - INFO - Optimizer step 40: log_σ²=0.485559, weight=0.615353 2025-07-16 19:02:51,580 - INFO - log_σ² gradient: -0.310420 2025-07-16 19:02:51,647 - INFO - Optimizer step 41: log_σ²=0.485635, weight=0.615307 2025-07-16 19:03:13,680 - INFO - log_σ² gradient: -0.320271 2025-07-16 19:03:13,751 - INFO - Optimizer step 42: log_σ²=0.485709, weight=0.615261 2025-07-16 19:03:36,790 - INFO - log_σ² gradient: -0.311972 2025-07-16 19:03:36,861 - INFO - Optimizer step 43: log_σ²=0.485781, weight=0.615216 2025-07-16 19:03:58,415 - INFO - log_σ² gradient: -0.312087 2025-07-16 19:03:58,487 - INFO - Optimizer step 44: log_σ²=0.485853, weight=0.615172 2025-07-16 19:04:23,159 - INFO - log_σ² gradient: -0.307423 2025-07-16 19:04:23,242 - INFO - Optimizer step 45: log_σ²=0.485923, weight=0.615129 2025-07-16 19:04:44,837 - INFO - log_σ² gradient: -0.318125 2025-07-16 19:04:44,916 - INFO - Optimizer step 46: log_σ²=0.485992, weight=0.615087 2025-07-16 19:05:07,400 - INFO - log_σ² gradient: -0.307826 2025-07-16 19:05:07,470 - INFO - Optimizer step 47: log_σ²=0.486059, weight=0.615046 2025-07-16 19:05:29,011 - INFO - log_σ² gradient: -0.309199 2025-07-16 19:05:29,081 - INFO - Optimizer step 48: log_σ²=0.486125, weight=0.615005 2025-07-16 19:05:39,264 - INFO - log_σ² gradient: -0.142850 2025-07-16 19:05:39,333 - INFO - Optimizer step 49: log_σ²=0.486186, weight=0.614967 2025-07-16 19:05:39,508 - INFO - Epoch 29: Total optimizer steps: 49 2025-07-16 19:08:38,885 - INFO - Validation metrics: 2025-07-16 19:08:38,885 - INFO - Loss: 0.4056 2025-07-16 19:08:38,885 - INFO - BCE Loss: 0.3153 
2025-07-16 19:08:38,885 - INFO - Weighted BCE Loss: 0.1939 2025-07-16 19:08:38,885 - INFO - Average similarity: 0.7754 2025-07-16 19:08:38,885 - INFO - Median similarity: 0.7984 2025-07-16 19:08:38,885 - INFO - Clean sample similarity: 0.7754 2025-07-16 19:08:38,885 - INFO - Corrupted sample similarity: 0.3646 2025-07-16 19:08:38,885 - INFO - Similarity gap (clean - corrupt): 0.4108 2025-07-16 19:08:39,083 - INFO - Epoch 29/30 - Train Loss: 0.4315, Val Loss: 0.4056, Val BCE: 0.3153, Val wBCE: 0.1939, Clean Sim: 0.7754, Corrupt Sim: 0.3646, Gap: 0.4108, Time: 1292.73s 2025-07-16 19:09:13,035 - INFO - log_σ² gradient: -0.314014 2025-07-16 19:09:13,106 - INFO - Optimizer step 1: log_σ²=0.486247, weight=0.614930 2025-07-16 19:09:34,796 - INFO - log_σ² gradient: -0.309701 2025-07-16 19:09:34,873 - INFO - Optimizer step 2: log_σ²=0.486306, weight=0.614894 2025-07-16 19:09:57,721 - INFO - log_σ² gradient: -0.314239 2025-07-16 19:09:57,799 - INFO - Optimizer step 3: log_σ²=0.486365, weight=0.614858 2025-07-16 19:10:19,453 - INFO - log_σ² gradient: -0.316215 2025-07-16 19:10:19,527 - INFO - Optimizer step 4: log_σ²=0.486422, weight=0.614822 2025-07-16 19:10:42,207 - INFO - log_σ² gradient: -0.317716 2025-07-16 19:10:42,282 - INFO - Optimizer step 5: log_σ²=0.486479, weight=0.614787 2025-07-16 19:11:05,277 - INFO - log_σ² gradient: -0.303834 2025-07-16 19:11:05,345 - INFO - Optimizer step 6: log_σ²=0.486534, weight=0.614753 2025-07-16 19:11:27,473 - INFO - log_σ² gradient: -0.317756 2025-07-16 19:11:27,543 - INFO - Optimizer step 7: log_σ²=0.486589, weight=0.614720 2025-07-16 19:11:48,663 - INFO - log_σ² gradient: -0.320218 2025-07-16 19:11:48,733 - INFO - Optimizer step 8: log_σ²=0.486642, weight=0.614687 2025-07-16 19:12:11,136 - INFO - log_σ² gradient: -0.310824 2025-07-16 19:12:11,212 - INFO - Optimizer step 9: log_σ²=0.486694, weight=0.614655 2025-07-16 19:12:32,700 - INFO - log_σ² gradient: -0.310688 2025-07-16 19:12:32,773 - INFO - Optimizer step 10: log_σ²=0.486746, 
weight=0.614623 2025-07-16 19:12:55,548 - INFO - log_σ² gradient: -0.312969 2025-07-16 19:12:55,618 - INFO - Optimizer step 11: log_σ²=0.486795, weight=0.614593 2025-07-16 19:13:17,566 - INFO - log_σ² gradient: -0.313588 2025-07-16 19:13:17,632 - INFO - Optimizer step 12: log_σ²=0.486844, weight=0.614563 2025-07-16 19:13:39,737 - INFO - log_σ² gradient: -0.308658 2025-07-16 19:13:39,811 - INFO - Optimizer step 13: log_σ²=0.486892, weight=0.614534 2025-07-16 19:14:02,954 - INFO - log_σ² gradient: -0.315908 2025-07-16 19:14:03,027 - INFO - Optimizer step 14: log_σ²=0.486938, weight=0.614505 2025-07-16 19:14:24,253 - INFO - log_σ² gradient: -0.309891 2025-07-16 19:14:24,334 - INFO - Optimizer step 15: log_σ²=0.486983, weight=0.614478 2025-07-16 19:14:47,219 - INFO - log_σ² gradient: -0.317794 2025-07-16 19:14:47,291 - INFO - Optimizer step 16: log_σ²=0.487027, weight=0.614451 2025-07-16 19:15:09,832 - INFO - log_σ² gradient: -0.315314 2025-07-16 19:15:09,903 - INFO - Optimizer step 17: log_σ²=0.487070, weight=0.614424 2025-07-16 19:15:31,395 - INFO - log_σ² gradient: -0.315509 2025-07-16 19:15:31,466 - INFO - Optimizer step 18: log_σ²=0.487111, weight=0.614399 2025-07-16 19:15:54,133 - INFO - log_σ² gradient: -0.313225 2025-07-16 19:15:54,203 - INFO - Optimizer step 19: log_σ²=0.487151, weight=0.614374 2025-07-16 19:16:15,777 - INFO - log_σ² gradient: -0.310730 2025-07-16 19:16:15,852 - INFO - Optimizer step 20: log_σ²=0.487190, weight=0.614350 2025-07-16 19:16:39,105 - INFO - log_σ² gradient: -0.309071 2025-07-16 19:16:39,175 - INFO - Optimizer step 21: log_σ²=0.487228, weight=0.614327 2025-07-16 19:17:00,925 - INFO - log_σ² gradient: -0.318339 2025-07-16 19:17:01,003 - INFO - Optimizer step 22: log_σ²=0.487264, weight=0.614305 2025-07-16 19:17:24,578 - INFO - log_σ² gradient: -0.309540 2025-07-16 19:17:24,652 - INFO - Optimizer step 23: log_σ²=0.487299, weight=0.614283 2025-07-16 19:17:45,613 - INFO - log_σ² gradient: -0.315776 2025-07-16 19:17:45,691 - INFO - 
Optimizer step 24: log_σ²=0.487333, weight=0.614262 2025-07-16 19:18:07,903 - INFO - log_σ² gradient: -0.309321 2025-07-16 19:18:07,978 - INFO - Optimizer step 25: log_σ²=0.487366, weight=0.614242 2025-07-16 19:18:29,571 - INFO - log_σ² gradient: -0.309336 2025-07-16 19:18:29,649 - INFO - Optimizer step 26: log_σ²=0.487397, weight=0.614223 2025-07-16 19:18:50,426 - INFO - log_σ² gradient: -0.317816 2025-07-16 19:18:50,504 - INFO - Optimizer step 27: log_σ²=0.487427, weight=0.614205 2025-07-16 19:19:11,900 - INFO - log_σ² gradient: -0.311552 2025-07-16 19:19:11,977 - INFO - Optimizer step 28: log_σ²=0.487455, weight=0.614187 2025-07-16 19:19:33,349 - INFO - log_σ² gradient: -0.312716 2025-07-16 19:19:33,424 - INFO - Optimizer step 29: log_σ²=0.487483, weight=0.614170 2025-07-16 19:19:55,226 - INFO - log_σ² gradient: -0.317215 2025-07-16 19:19:55,299 - INFO - Optimizer step 30: log_σ²=0.487509, weight=0.614154 2025-07-16 19:20:18,003 - INFO - log_σ² gradient: -0.313808 2025-07-16 19:20:18,075 - INFO - Optimizer step 31: log_σ²=0.487534, weight=0.614139 2025-07-16 19:20:40,089 - INFO - log_σ² gradient: -0.314818 2025-07-16 19:20:40,162 - INFO - Optimizer step 32: log_σ²=0.487557, weight=0.614125 2025-07-16 19:21:02,465 - INFO - log_σ² gradient: -0.311942 2025-07-16 19:21:02,538 - INFO - Optimizer step 33: log_σ²=0.487579, weight=0.614111 2025-07-16 19:21:25,905 - INFO - log_σ² gradient: -0.319743 2025-07-16 19:21:25,983 - INFO - Optimizer step 34: log_σ²=0.487600, weight=0.614098 2025-07-16 19:21:47,508 - INFO - log_σ² gradient: -0.308666 2025-07-16 19:21:47,580 - INFO - Optimizer step 35: log_σ²=0.487620, weight=0.614086 2025-07-16 19:22:09,716 - INFO - log_σ² gradient: -0.315249 2025-07-16 19:22:09,794 - INFO - Optimizer step 36: log_σ²=0.487638, weight=0.614075 2025-07-16 19:22:31,648 - INFO - log_σ² gradient: -0.317830 2025-07-16 19:22:31,724 - INFO - Optimizer step 37: log_σ²=0.487656, weight=0.614064 2025-07-16 19:22:53,787 - INFO - log_σ² gradient: -0.312425 
2025-07-16 19:22:53,859 - INFO - Optimizer step 38: log_σ²=0.487671, weight=0.614055
2025-07-16 19:23:15,364 - INFO - log_σ² gradient: -0.304591
2025-07-16 19:23:15,444 - INFO - Optimizer step 39: log_σ²=0.487686, weight=0.614046
2025-07-16 19:23:36,581 - INFO - log_σ² gradient: -0.307430
2025-07-16 19:23:36,654 - INFO - Optimizer step 40: log_σ²=0.487699, weight=0.614038
2025-07-16 19:23:59,723 - INFO - log_σ² gradient: -0.309160
2025-07-16 19:23:59,788 - INFO - Optimizer step 41: log_σ²=0.487711, weight=0.614031
2025-07-16 19:24:20,979 - INFO - log_σ² gradient: -0.309993
2025-07-16 19:24:21,062 - INFO - Optimizer step 42: log_σ²=0.487721, weight=0.614024
2025-07-16 19:24:43,845 - INFO - log_σ² gradient: -0.312801
2025-07-16 19:24:43,922 - INFO - Optimizer step 43: log_σ²=0.487730, weight=0.614019
2025-07-16 19:25:07,696 - INFO - log_σ² gradient: -0.314393
2025-07-16 19:25:07,771 - INFO - Optimizer step 44: log_σ²=0.487738, weight=0.614014
2025-07-16 19:25:29,111 - INFO - log_σ² gradient: -0.306530
2025-07-16 19:25:29,186 - INFO - Optimizer step 45: log_σ²=0.487745, weight=0.614010
2025-07-16 19:25:52,509 - INFO - log_σ² gradient: -0.317035
2025-07-16 19:25:52,591 - INFO - Optimizer step 46: log_σ²=0.487750, weight=0.614007
2025-07-16 19:26:14,430 - INFO - log_σ² gradient: -0.313475
2025-07-16 19:26:14,505 - INFO - Optimizer step 47: log_σ²=0.487754, weight=0.614004
2025-07-16 19:26:35,673 - INFO - log_σ² gradient: -0.319567
2025-07-16 19:26:35,746 - INFO - Optimizer step 48: log_σ²=0.487756, weight=0.614002
2025-07-16 19:26:45,937 - INFO - log_σ² gradient: -0.146140
2025-07-16 19:26:46,007 - INFO - Optimizer step 49: log_σ²=0.487758, weight=0.614002
2025-07-16 19:26:46,282 - INFO - Epoch 30: Total optimizer steps: 49
2025-07-16 19:29:45,893 - INFO - Validation metrics:
2025-07-16 19:29:45,893 - INFO - Loss: 0.4082
2025-07-16 19:29:45,893 - INFO - BCE Loss: 0.3146
2025-07-16 19:29:45,893 - INFO - Weighted BCE Loss: 0.1932
2025-07-16 19:29:45,893 - INFO - Average similarity: 0.7799
2025-07-16 19:29:45,893 - INFO - Median similarity: 0.8049
2025-07-16 19:29:45,893 - INFO - Clean sample similarity: 0.7799
2025-07-16 19:29:45,894 - INFO - Corrupted sample similarity: 0.3714
2025-07-16 19:29:45,894 - INFO - Similarity gap (clean - corrupt): 0.4085
2025-07-16 19:29:46,069 - INFO - Epoch 30/30 - Train Loss: 0.4276, Val Loss: 0.4082, Val BCE: 0.3146, Val wBCE: 0.1932, Clean Sim: 0.7799, Corrupt Sim: 0.3714, Gap: 0.4085, Time: 1266.99s
2025-07-16 19:32:34,686 - INFO - Epoch 30 Validation Alignment: Pos=0.231, Neg=0.109, Gap=0.122
2025-07-16 19:32:34,686 - INFO - Training completed!
2025-07-16 19:32:37,373 - INFO - Evaluating best models on test set...
2025-07-16 19:32:38,320 - INFO - Loaded best loss model from epoch 27
2025-07-16 19:35:55,804 - INFO - Test (Best Loss) metrics:
2025-07-16 19:35:55,805 - INFO - Loss: 0.4103
2025-07-16 19:35:55,805 - INFO - BCE Loss: 0.3173
2025-07-16 19:35:55,805 - INFO - Weighted BCE Loss: 0.1948
2025-07-16 19:35:55,805 - INFO - Average similarity: 0.7277
2025-07-16 19:35:55,805 - INFO - Median similarity: 0.7505
2025-07-16 19:35:55,805 - INFO - Clean sample similarity: 0.7277
2025-07-16 19:35:55,805 - INFO - Corrupted sample similarity: 0.3438
2025-07-16 19:35:55,805 - INFO - Similarity gap (clean - corrupt): 0.3839
2025-07-16 19:38:53,903 - INFO - Loaded best gap model from epoch 28
2025-07-16 19:42:13,802 - INFO - Test (Best Gap) metrics:
2025-07-16 19:42:13,803 - INFO - Loss: 0.4171
2025-07-16 19:42:13,803 - INFO - BCE Loss: 0.3176
2025-07-16 19:42:13,803 - INFO - Weighted BCE Loss: 0.1950
2025-07-16 19:42:13,803 - INFO - Average similarity: 0.7656
2025-07-16 19:42:13,803 - INFO - Median similarity: 0.7958
2025-07-16 19:42:13,803 - INFO - Clean sample similarity: 0.7656
2025-07-16 19:42:13,803 - INFO - Corrupted sample similarity: 0.3614
2025-07-16 19:42:13,803 - INFO - Similarity gap (clean - corrupt): 0.4042
2025-07-16 19:45:18,228 - INFO - Evaluation completed!
2025-07-16 19:45:18,228 - INFO - Test results for best_loss_model:
2025-07-16 19:45:18,228 - INFO - Loss: 0.4103
2025-07-16 19:45:18,228 - INFO - Clean Sample Similarity: 0.7277
2025-07-16 19:45:18,228 - INFO - Corrupted Sample Similarity: 0.3438
2025-07-16 19:45:18,228 - INFO - Similarity Gap: 0.3839
2025-07-16 19:45:18,228 - INFO - Test results for best_gap_model:
2025-07-16 19:45:18,228 - INFO - Loss: 0.4171
2025-07-16 19:45:18,228 - INFO - Clean Sample Similarity: 0.7656
2025-07-16 19:45:18,228 - INFO - Corrupted Sample Similarity: 0.3614
2025-07-16 19:45:18,228 - INFO - Similarity Gap: 0.4042
2025-07-16 19:45:18,446 - INFO - All tasks completed!