05/12/2026 04:26:58 - INFO - accelerate.utils.modeling - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk). 05/12/2026 04:28:00 - INFO - root - number of sec samples before upsampling: 1810 05/12/2026 04:28:00 - INFO - root - number of sec samples after upsampling: 2415 05/12/2026 04:28:08 - INFO - root - Training args Namespace(output_name='deepseek-coder-6.7b-lora-safecoder', datasets=['evol', 'sec-desc', 'sec-new-desc'], pretrain_name='deepseek-coder-6.7b', loss_weight=1.0, sven=False, num_train_epochs=2, learning_rate=2e-05, max_num_tokens=1024, batch_size=1, grad_acc_steps=16, weight_decay=0.01, adam_epsilon=1e-08, warmup_steps=0, max_grad_norm=1.0, dropout=0.1, kl_loss_weight=0, exclude_neg=False, no_weights=False, lora=True, r=16, lora_alpha=32, lora_dropout=0.1, sampling_size=20, sampling_method='minority', cwes=['all'], langs=['all'], logging_steps=50, save_epochs=10, seed=2, data_dir='../data_train_val', model_dir='../trained/', output_dir='../trained/deepseek-coder-6.7b-lora-safecoder', logger=) 05/12/2026 04:28:08 - INFO - root - ***** Running training ***** 05/12/2026 04:28:08 - INFO - root - Num samples = 30979 05/12/2026 04:28:08 - INFO - root - Num epoch = 2 05/12/2026 04:28:08 - INFO - root - Batch size= 1 05/12/2026 04:28:08 - INFO - root - Total batch size (w. 
accumulation) = 16 05/12/2026 04:28:08 - INFO - root - Gradient Accumulation steps = 16 05/12/2026 04:28:08 - INFO - root - Total optimization steps = 3872 05/12/2026 04:28:08 - INFO - root - Num val samples = 3371 05/12/2026 04:28:08 - INFO - root - Num parameters = 6779150688 05/12/2026 04:28:08 - INFO - root - Num trainable parameters = 40554848 05/12/2026 04:32:09 - INFO - root - epochs: 1/2, steps: 50/3872, func: 0.051139, pos: 0.0577, neg: 0.181555, 1%: 5h 8m 19s 05/12/2026 04:36:12 - INFO - root - epochs: 1/2, steps: 100/3872, func: 0.048759, pos: 0.062282, neg: 0.140178, 2%: 5h 4m 23s 05/12/2026 04:40:09 - INFO - root - epochs: 1/2, steps: 150/3872, func: 0.047331, pos: 0.077192, neg: 0.118673, 3%: 4h 58m 31s 05/12/2026 04:44:10 - INFO - root - epochs: 1/2, steps: 200/3872, func: 0.047128, pos: 0.066722, neg: 0.09821, 5%: 4h 54m 36s 05/12/2026 04:48:07 - INFO - root - epochs: 1/2, steps: 250/3872, func: 0.048349, pos: 0.090577, neg: 0.056817, 6%: 4h 49m 36s 05/12/2026 04:52:04 - INFO - root - epochs: 1/2, steps: 300/3872, func: 0.046514, pos: 0.078, neg: 0.04461, 7%: 4h 45m 13s 05/12/2026 04:56:03 - INFO - root - epochs: 1/2, steps: 350/3872, func: 0.048205, pos: 0.065144, neg: 0.056317, 9%: 4h 41m 5s 05/12/2026 05:00:05 - INFO - root - epochs: 1/2, steps: 400/3872, func: 0.046193, pos: 0.075261, neg: 0.040788, 10%: 4h 37m 26s 05/12/2026 05:04:07 - INFO - root - epochs: 1/2, steps: 450/3872, func: 0.045684, pos: 0.077037, neg: 0.050153, 11%: 4h 33m 47s 05/12/2026 05:08:11 - INFO - root - epochs: 1/2, steps: 500/3872, func: 0.045614, pos: 0.072255, neg: 0.053337, 12%: 4h 30m 17s 05/12/2026 05:12:10 - INFO - root - epochs: 1/2, steps: 550/3872, func: 0.046064, pos: 0.060699, neg: 0.041991, 14%: 4h 26m 7s 05/12/2026 05:16:15 - INFO - root - epochs: 1/2, steps: 600/3872, func: 0.045751, pos: 0.081914, neg: 0.042419, 15%: 4h 22m 25s 05/12/2026 05:20:14 - INFO - root - epochs: 1/2, steps: 650/3872, func: 0.044901, pos: 0.06346, neg: 0.040288, 16%: 4h 18m 24s 
05/12/2026 05:24:22 - INFO - root - epochs: 1/2, steps: 700/3872, func: 0.046995, pos: 0.089755, neg: 0.028394, 18%: 4h 14m 57s 05/12/2026 05:28:23 - INFO - root - epochs: 1/2, steps: 750/3872, func: 0.046683, pos: 0.067772, neg: 0.032075, 19%: 4h 10m 52s 05/12/2026 05:32:19 - INFO - root - epochs: 1/2, steps: 800/3872, func: 0.046182, pos: 0.066598, neg: 0.032235, 20%: 4h 6m 31s 05/12/2026 05:36:21 - INFO - root - epochs: 1/2, steps: 850/3872, func: 0.045583, pos: 0.068573, neg: 0.031431, 21%: 4h 2m 38s 05/12/2026 05:40:20 - INFO - root - epochs: 1/2, steps: 900/3872, func: 0.045578, pos: 0.08372, neg: 0.030609, 23%: 3h 58m 33s 05/12/2026 05:44:20 - INFO - root - epochs: 1/2, steps: 950/3872, func: 0.045548, pos: 0.060191, neg: 0.033258, 24%: 3h 54m 28s 05/12/2026 05:48:23 - INFO - root - epochs: 1/2, steps: 1000/3872, func: 0.045722, pos: 0.052546, neg: 0.031617, 25%: 3h 50m 35s 05/12/2026 05:52:24 - INFO - root - epochs: 1/2, steps: 1050/3872, func: 0.045856, pos: 0.071707, neg: 0.02853, 27%: 3h 46m 33s 05/12/2026 05:56:28 - INFO - root - epochs: 1/2, steps: 1100/3872, func: 0.046406, pos: 0.07153, neg: 0.033045, 28%: 3h 42m 41s 05/12/2026 06:00:33 - INFO - root - epochs: 1/2, steps: 1150/3872, func: 0.045131, pos: 0.050152, neg: 0.035439, 29%: 3h 38m 49s 05/12/2026 06:04:33 - INFO - root - epochs: 1/2, steps: 1200/3872, func: 0.044958, pos: 0.056376, neg: 0.03231, 30%: 3h 34m 47s 05/12/2026 06:08:35 - INFO - root - epochs: 1/2, steps: 1250/3872, func: 0.045854, pos: 0.069139, neg: 0.031689, 32%: 3h 30m 46s 05/12/2026 06:12:35 - INFO - root - epochs: 1/2, steps: 1300/3872, func: 0.045146, pos: 0.059226, neg: 0.032273, 33%: 3h 26m 45s 05/12/2026 06:16:28 - INFO - root - epochs: 1/2, steps: 1350/3872, func: 0.044379, pos: 0.064239, neg: 0.033579, 34%: 3h 22m 28s 05/12/2026 06:20:35 - INFO - root - epochs: 1/2, steps: 1400/3872, func: 0.045807, pos: 0.054144, neg: 0.032231, 36%: 3h 18m 39s 05/12/2026 06:24:37 - INFO - root - epochs: 1/2, steps: 1450/3872, func: 
0.045381, pos: 0.050893, neg: 0.030635, 37%: 3h 14m 39s 05/12/2026 06:28:45 - INFO - root - epochs: 1/2, steps: 1500/3872, func: 0.045888, pos: 0.062029, neg: 0.028009, 38%: 3h 10m 48s 05/12/2026 06:32:47 - INFO - root - epochs: 1/2, steps: 1550/3872, func: 0.045597, pos: 0.04823, neg: 0.033559, 40%: 3h 6m 49s 05/12/2026 06:36:47 - INFO - root - epochs: 1/2, steps: 1600/3872, func: 0.046528, pos: 0.060938, neg: 0.023728, 41%: 3h 2m 46s 05/12/2026 06:44:55 - INFO - root - epochs: 1/2, steps: 1700/3872, func: 0.046265, pos: 0.066801, neg: 0.025508, 43%: 2h 54m 52s 05/12/2026 06:49:03 - INFO - root - epochs: 1/2, steps: 1750/3872, func: 0.04581, pos: 0.061126, neg: 0.021841, 45%: 2h 50m 57s 05/12/2026 06:53:12 - INFO - root - epochs: 1/2, steps: 1800/3872, func: 0.04477, pos: 0.065656, neg: 0.025342, 46%: 2h 47m 4s 05/12/2026 06:57:10 - INFO - root - epochs: 1/2, steps: 1850/3872, func: 0.044691, pos: 0.049599, neg: 0.024145, 47%: 2h 42m 59s 05/12/2026 07:01:05 - INFO - root - epochs: 1/2, steps: 1900/3872, func: 0.0449, pos: 0.04835, neg: 0.024969, 49%: 2h 38m 50s 05/12/2026 07:05:05 - INFO - root - epochs: 2/2, steps: 1950/3872, func: 0.044635, pos: 0.056919, neg: 0.026455, 50%: 2h 34m 47s 05/12/2026 07:09:09 - INFO - root - epochs: 2/2, steps: 2000/3872, func: 0.044986, pos: 0.048561, neg: 0.013954, 51%: 2h 30m 47s 05/12/2026 07:13:09 - INFO - root - epochs: 2/2, steps: 2050/3872, func: 0.045324, pos: 0.049473, neg: 0.0231, 52%: 2h 26m 45s 05/12/2026 07:17:15 - INFO - root - epochs: 2/2, steps: 2100/3872, func: 0.043874, pos: 0.0474, neg: 0.018272, 54%: 2h 22m 48s 05/12/2026 07:21:14 - INFO - root - epochs: 2/2, steps: 2150/3872, func: 0.04517, pos: 0.052697, neg: 0.022861, 55%: 2h 18m 43s 05/12/2026 07:25:16 - INFO - root - epochs: 2/2, steps: 2200/3872, func: 0.045326, pos: 0.05154, neg: 0.02561, 56%: 2h 14m 41s 05/12/2026 07:29:10 - 
INFO - root - epochs: 2/2, steps: 2250/3872, func: 0.044066, pos: 0.044166, neg: 0.013894, 58%: 2h 10m 35s 05/12/2026 07:33:11 - INFO - root - epochs: 2/2, steps: 2300/3872, func: 0.044331, pos: 0.032139, neg: 0.013464, 59%: 2h 6m 34s 05/12/2026 07:37:14 - INFO - root - epochs: 2/2, steps: 2350/3872, func: 0.044916, pos: 0.036937, neg: 0.023725, 60%: 2h 2m 33s 05/12/2026 07:41:11 - INFO - root - epochs: 2/2, steps: 2400/3872, func: 0.044357, pos: 0.044459, neg: 0.017898, 61%: 1h 58m 28s 05/12/2026 07:45:18 - INFO - root - epochs: 2/2, steps: 2450/3872, func: 0.044135, pos: 0.05079, neg: 0.025111, 63%: 1h 54m 31s 05/12/2026 07:49:22 - INFO - root - epochs: 2/2, steps: 2500/3872, func: 0.044937, pos: 0.056568, neg: 0.022755, 64%: 1h 50m 31s 05/12/2026 07:53:21 - INFO - root - epochs: 2/2, steps: 2550/3872, func: 0.044683, pos: 0.041534, neg: 0.019721, 65%: 1h 46m 28s 05/12/2026 07:57:21 - INFO - root - epochs: 2/2, steps: 2600/3872, func: 0.043585, pos: 0.039235, neg: 0.018281, 67%: 1h 42m 26s 05/12/2026 08:01:19 - INFO - root - epochs: 2/2, steps: 2650/3872, func: 0.044604, pos: 0.030112, neg: 0.020019, 68%: 1h 38m 23s 05/12/2026 08:05:20 - INFO - root - epochs: 2/2, steps: 2700/3872, func: 0.044264, pos: 0.058356, neg: 0.019272, 69%: 1h 34m 21s 05/12/2026 08:09:25 - INFO - root - epochs: 2/2, steps: 2750/3872, func: 0.044934, pos: 0.0307, neg: 0.014201, 70%: 1h 30m 22s 05/12/2026 08:13:28 - INFO - root - epochs: 2/2, steps: 2800/3872, func: 0.045121, pos: 0.043066, neg: 0.015057, 72%: 1h 26m 21s 05/12/2026 08:17:28 - INFO - root - epochs: 2/2, steps: 2850/3872, func: 0.043984, pos: 0.051797, neg: 0.019197, 73%: 1h 22m 19s 05/12/2026 08:21:27 - INFO - root - epochs: 2/2, steps: 2900/3872, func: 0.044244, pos: 0.040063, neg: 0.020778, 74%: 1h 18m 17s 05/12/2026 08:25:30 - INFO - root - epochs: 2/2, steps: 2950/3872, func: 0.044645, pos: 0.046563, neg: 0.020736, 76%: 1h 14m 15s 05/12/2026 08:29:26 - INFO - root - epochs: 2/2, steps: 3000/3872, func: 0.044216, pos: 
0.0446, neg: 0.01489, 77%: 1h 10m 13s 05/12/2026 08:33:21 - INFO - root - epochs: 2/2, steps: 3050/3872, func: 0.043855, pos: 0.053021, neg: 0.022313, 78%: 1h 6m 10s 05/12/2026 08:37:20 - INFO - root - epochs: 2/2, steps: 3100/3872, func: 0.044729, pos: 0.052757, neg: 0.014513, 80%: 1h 2m 8s 05/12/2026 08:41:20 - INFO - root - epochs: 2/2, steps: 3150/3872, func: 0.045399, pos: 0.045577, neg: 0.017218, 81%: 0h 58m 7s 05/12/2026 08:45:17 - INFO - root - epochs: 2/2, steps: 3200/3872, func: 0.044257, pos: 0.045171, neg: 0.011818, 82%: 0h 54m 5s 05/12/2026 08:49:12 - INFO - root - epochs: 2/2, steps: 3250/3872, func: 0.045193, pos: 0.040208, neg: 0.018835, 83%: 0h 50m 2s 05/12/2026 08:53:13 - INFO - root - epochs: 2/2, steps: 3300/3872, func: 0.045501, pos: 0.05843, neg: 0.018807, 85%: 0h 46m 1s 05/12/2026 08:57:13 - INFO - root - epochs: 2/2, steps: 3350/3872, func: 0.043979, pos: 0.03747, neg: 0.015223, 86%: 0h 42m 0s 05/12/2026 09:01:09 - INFO - root - epochs: 2/2, steps: 3400/3872, func: 0.043928, pos: 0.029227, neg: 0.016011, 87%: 0h 37m 58s 05/12/2026 09:05:06 - INFO - root - epochs: 2/2, steps: 3450/3872, func: 0.044516, pos: 0.046607, neg: 0.019672, 89%: 0h 33m 57s 05/12/2026 09:09:10 - INFO - root - epochs: 2/2, steps: 3500/3872, func: 0.043855, pos: 0.040829, neg: 0.015805, 90%: 0h 29m 57s 05/12/2026 09:13:14 - INFO - root - epochs: 2/2, steps: 3550/3872, func: 0.044373, pos: 0.032478, neg: 0.020333, 91%: 0h 25m 56s 05/12/2026 09:17:09 - INFO - root - epochs: 2/2, steps: 3600/3872, func: 0.044835, pos: 0.03874, neg: 0.023242, 92%: 0h 21m 55s 05/12/2026 09:21:06 - INFO - root - epochs: 2/2, steps: 3650/3872, func: 0.043907, pos: 0.038848, neg: 0.016644, 94%: 0h 17m 53s 05/12/2026 09:25:04 - INFO - root - epochs: 2/2, steps: 3700/3872, func: 0.045019, pos: 0.028633, neg: 0.012057, 95%: 0h 13m 53s 05/12/2026 09:29:02 - INFO - root - epochs: 2/2, steps: 3750/3872, func: 0.044583, pos: 0.051069, neg: 0.021792, 96%: 0h 9m 52s 05/12/2026 09:33:03 - INFO - root - 
epochs: 2/2, steps: 3800/3872, func: 0.045005, pos: 0.051063, neg: 0.025109, 98%: 0h 5m 51s 05/12/2026 09:37:01 - INFO - root - epochs: 2/2, steps: 3850/3872, func: 0.044776, pos: 0.039668, neg: 0.019964, 99%: 0h 1m 50s 05/12/2026 09:47:01 - INFO - root - final eval loss: func: 0.044795, pos: 0.056991, neg: 0.026537 05/12/2026 09:47:01 - INFO - root - Saving model checkpoint to ../trained/deepseek-coder-6.7b-lora-safecoder/checkpoint-last