| 05/15/2026 17:53:01 - INFO - accelerate.utils.modeling - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk). |
| 05/15/2026 17:53:34 - INFO - root - Training args Namespace(output_name='codellama-7b-lora-codellama-7b_std', datasets=['evol'], pretrain_name='codellama-7b', loss_weight=1.0, sven=False, num_train_epochs=2, learning_rate=2e-05, max_num_tokens=1024, batch_size=1, grad_acc_steps=16, weight_decay=0.01, adam_epsilon=1e-08, warmup_steps=0, max_grad_norm=1.0, dropout=0.1, kl_loss_weight=0, exclude_neg=False, no_weights=False, lora=True, r=16, lora_alpha=32, lora_dropout=0.1, sampling_size=20, sampling_method='minority', cwes=['all'], langs=['all'], logging_steps=50, save_epochs=10, seed=2, data_dir='/mnt/scratch/QRM/experiments/SCoDE/data_train_val', model_dir='/mnt/scratch/QRM/experiments/SCoDE/trained', output_dir='/mnt/scratch/QRM/experiments/SCoDE/trained/codellama-7b-lora-codellama-7b_std', logger=<RootLogger root (INFO)>) |
| 05/15/2026 17:53:34 - INFO - root - ***** Running training ***** |
| 05/15/2026 17:53:34 - INFO - root - Num samples = 28298 |
| 05/15/2026 17:53:34 - INFO - root - Num epoch = 2 |
| 05/15/2026 17:53:34 - INFO - root - Batch size= 1 |
| 05/15/2026 17:53:34 - INFO - root - Total batch size (w. accumulation) = 16 |
| 05/15/2026 17:53:34 - INFO - root - Gradient Accumulation steps = 16 |
| 05/15/2026 17:53:34 - INFO - root - Total optimization steps = 3536 |
| 05/15/2026 17:53:34 - INFO - root - Num val samples = 3143 |
| 05/15/2026 17:53:34 - INFO - root - Num parameters = 6779101440 |
| 05/15/2026 17:53:34 - INFO - root - Num trainable parameters = 40554752 |
| 05/15/2026 17:56:20 - INFO - root - epochs: 1/2, steps: 50/3536, func: 0.053553, 1%: 3h 12m 37s |
| 05/15/2026 17:59:04 - INFO - root - epochs: 1/2, steps: 100/3536, func: 0.0489, 2%: 3h 9m 4s |
| 05/15/2026 18:01:47 - INFO - root - epochs: 1/2, steps: 150/3536, func: 0.049055, 4%: 3h 5m 26s |
| 05/15/2026 18:04:30 - INFO - root - epochs: 1/2, steps: 200/3536, func: 0.047903, 5%: 3h 2m 19s |
| 05/15/2026 18:07:14 - INFO - root - epochs: 1/2, steps: 250/3536, func: 0.0482, 7%: 2h 59m 39s |
| 05/15/2026 18:09:56 - INFO - root - epochs: 1/2, steps: 300/3536, func: 0.04778, 8%: 2h 56m 38s |
| 05/15/2026 18:12:40 - INFO - root - epochs: 1/2, steps: 350/3536, func: 0.047599, 9%: 2h 53m 53s |
| 05/15/2026 18:15:23 - INFO - root - epochs: 1/2, steps: 400/3536, func: 0.047311, 11%: 2h 51m 4s |
| 05/15/2026 18:18:07 - INFO - root - epochs: 1/2, steps: 450/3536, func: 0.047141, 12%: 2h 48m 20s |
| 05/15/2026 18:20:50 - INFO - root - epochs: 1/2, steps: 500/3536, func: 0.046227, 14%: 2h 45m 36s |
| 05/15/2026 18:23:32 - INFO - root - epochs: 1/2, steps: 550/3536, func: 0.047724, 15%: 2h 42m 41s |
| 05/15/2026 18:26:14 - INFO - root - epochs: 1/2, steps: 600/3536, func: 0.046239, 16%: 2h 39m 52s |
| 05/15/2026 18:28:56 - INFO - root - epochs: 1/2, steps: 650/3536, func: 0.046872, 18%: 2h 37m 4s |
| 05/15/2026 18:31:38 - INFO - root - epochs: 1/2, steps: 700/3536, func: 0.046514, 19%: 2h 34m 16s |
| 05/15/2026 18:34:21 - INFO - root - epochs: 1/2, steps: 750/3536, func: 0.046605, 21%: 2h 31m 34s |
| 05/15/2026 18:37:04 - INFO - root - epochs: 1/2, steps: 800/3536, func: 0.046667, 22%: 2h 28m 48s |
| 05/15/2026 18:39:46 - INFO - root - epochs: 1/2, steps: 850/3536, func: 0.046913, 24%: 2h 26m 1s |
| 05/15/2026 18:42:28 - INFO - root - epochs: 1/2, steps: 900/3536, func: 0.046443, 25%: 2h 23m 16s |
| 05/15/2026 18:45:10 - INFO - root - epochs: 1/2, steps: 950/3536, func: 0.046632, 26%: 2h 20m 29s |
| 05/15/2026 18:47:52 - INFO - root - epochs: 1/2, steps: 1000/3536, func: 0.046355, 28%: 2h 17m 45s |
| 05/15/2026 18:50:35 - INFO - root - epochs: 1/2, steps: 1050/3536, func: 0.04673, 29%: 2h 15m 2s |
| 05/15/2026 18:53:17 - INFO - root - epochs: 1/2, steps: 1100/3536, func: 0.045299, 31%: 2h 12m 16s |
| 05/15/2026 18:56:00 - INFO - root - epochs: 1/2, steps: 1150/3536, func: 0.046432, 32%: 2h 9m 34s |
| 05/15/2026 18:58:42 - INFO - root - epochs: 1/2, steps: 1200/3536, func: 0.045988, 33%: 2h 6m 50s |
| 05/15/2026 19:01:24 - INFO - root - epochs: 1/2, steps: 1250/3536, func: 0.047306, 35%: 2h 4m 5s |
| 05/15/2026 19:04:07 - INFO - root - epochs: 1/2, steps: 1300/3536, func: 0.047335, 36%: 2h 1m 23s |
| 05/15/2026 19:06:49 - INFO - root - epochs: 1/2, steps: 1350/3536, func: 0.046585, 38%: 1h 58m 40s |
| 05/15/2026 19:09:32 - INFO - root - epochs: 1/2, steps: 1400/3536, func: 0.045611, 39%: 1h 55m 56s |
| 05/15/2026 19:12:11 - INFO - root - epochs: 1/2, steps: 1450/3536, func: 0.046602, 40%: 1h 53m 9s |
| 05/15/2026 19:14:42 - INFO - root - epochs: 1/2, steps: 1500/3536, func: 0.045796, 42%: 1h 50m 10s |
| 05/15/2026 19:17:14 - INFO - root - epochs: 1/2, steps: 1550/3536, func: 0.045634, 43%: 1h 47m 15s |
| 05/15/2026 19:19:46 - INFO - root - epochs: 1/2, steps: 1600/3536, func: 0.046124, 45%: 1h 44m 20s |
| 05/15/2026 19:22:16 - INFO - root - epochs: 1/2, steps: 1650/3536, func: 0.045239, 46%: 1h 41m 26s |
| 05/15/2026 19:24:47 - INFO - root - epochs: 1/2, steps: 1700/3536, func: 0.046727, 48%: 1h 38m 33s |
| 05/15/2026 19:27:17 - INFO - root - epochs: 1/2, steps: 1750/3536, func: 0.046448, 49%: 1h 35m 41s |
| 05/15/2026 19:29:51 - INFO - root - epochs: 2/2, steps: 1800/3536, func: 0.045678, 50%: 1h 32m 54s |
| 05/15/2026 19:32:22 - INFO - root - epochs: 2/2, steps: 1850/3536, func: 0.045587, 52%: 1h 30m 5s |
| 05/15/2026 19:34:53 - INFO - root - epochs: 2/2, steps: 1900/3536, func: 0.04649, 53%: 1h 27m 17s |
| 05/15/2026 19:37:24 - INFO - root - epochs: 2/2, steps: 1950/3536, func: 0.046279, 55%: 1h 24m 30s |
| 05/15/2026 19:39:55 - INFO - root - epochs: 2/2, steps: 2000/3536, func: 0.045697, 56%: 1h 21m 43s |
| 05/15/2026 19:42:27 - INFO - root - epochs: 2/2, steps: 2050/3536, func: 0.045146, 57%: 1h 18m 58s |
| 05/15/2026 19:44:58 - INFO - root - epochs: 2/2, steps: 2100/3536, func: 0.046819, 59%: 1h 16m 13s |
| 05/15/2026 19:47:28 - INFO - root - epochs: 2/2, steps: 2150/3536, func: 0.04625, 60%: 1h 13m 28s |
| 05/15/2026 19:49:59 - INFO - root - epochs: 2/2, steps: 2200/3536, func: 0.046144, 62%: 1h 10m 45s |
| 05/15/2026 19:52:31 - INFO - root - epochs: 2/2, steps: 2250/3536, func: 0.046049, 63%: 1h 8m 2s |
| 05/15/2026 19:55:01 - INFO - root - epochs: 2/2, steps: 2300/3536, func: 0.046191, 65%: 1h 5m 19s |
| 05/15/2026 19:57:32 - INFO - root - epochs: 2/2, steps: 2350/3536, func: 0.04618, 66%: 1h 2m 37s |
| 05/15/2026 20:00:04 - INFO - root - epochs: 2/2, steps: 2400/3536, func: 0.045209, 67%: 0h 59m 55s |
| 05/15/2026 20:02:36 - INFO - root - epochs: 2/2, steps: 2450/3536, func: 0.045665, 69%: 0h 57m 14s |
| 05/15/2026 20:05:08 - INFO - root - epochs: 2/2, steps: 2500/3536, func: 0.046733, 70%: 0h 54m 34s |
| 05/15/2026 20:07:39 - INFO - root - epochs: 2/2, steps: 2550/3536, func: 0.044726, 72%: 0h 51m 53s |
| 05/15/2026 20:10:11 - INFO - root - epochs: 2/2, steps: 2600/3536, func: 0.04645, 73%: 0h 49m 13s |
| 05/15/2026 20:12:42 - INFO - root - epochs: 2/2, steps: 2650/3536, func: 0.045592, 74%: 0h 46m 34s |
| 05/15/2026 20:15:13 - INFO - root - epochs: 2/2, steps: 2700/3536, func: 0.044934, 76%: 0h 43m 54s |
| 05/15/2026 20:17:44 - INFO - root - epochs: 2/2, steps: 2750/3536, func: 0.046683, 77%: 0h 41m 15s |
| 05/15/2026 20:20:16 - INFO - root - epochs: 2/2, steps: 2800/3536, func: 0.045118, 79%: 0h 38m 36s |
| 05/15/2026 20:22:46 - INFO - root - epochs: 2/2, steps: 2850/3536, func: 0.045185, 80%: 0h 35m 57s |
| 05/15/2026 20:25:19 - INFO - root - epochs: 2/2, steps: 2900/3536, func: 0.04674, 81%: 0h 33m 19s |
| 05/15/2026 20:27:49 - INFO - root - epochs: 2/2, steps: 2950/3536, func: 0.046423, 83%: 0h 30m 41s |
| 05/15/2026 20:30:20 - INFO - root - epochs: 2/2, steps: 3000/3536, func: 0.047098, 84%: 0h 28m 3s |
| 05/15/2026 20:32:52 - INFO - root - epochs: 2/2, steps: 3050/3536, func: 0.045522, 86%: 0h 25m 26s |
| 05/15/2026 20:35:24 - INFO - root - epochs: 2/2, steps: 3100/3536, func: 0.045772, 87%: 0h 22m 48s |
| 05/15/2026 20:37:55 - INFO - root - epochs: 2/2, steps: 3150/3536, func: 0.045837, 89%: 0h 20m 11s |
| 05/15/2026 20:40:26 - INFO - root - epochs: 2/2, steps: 3200/3536, func: 0.045333, 90%: 0h 17m 34s |
| 05/15/2026 20:42:56 - INFO - root - epochs: 2/2, steps: 3250/3536, func: 0.04512, 91%: 0h 14m 57s |
| 05/15/2026 20:45:28 - INFO - root - epochs: 2/2, steps: 3300/3536, func: 0.045386, 93%: 0h 12m 20s |
| 05/15/2026 20:47:59 - INFO - root - epochs: 2/2, steps: 3350/3536, func: 0.046192, 94%: 0h 9m 44s |
| 05/15/2026 20:50:31 - INFO - root - epochs: 2/2, steps: 3400/3536, func: 0.045377, 96%: 0h 7m 7s |
| 05/15/2026 20:53:01 - INFO - root - epochs: 2/2, steps: 3450/3536, func: 0.044902, 97%: 0h 4m 31s |
| 05/15/2026 20:55:31 - INFO - root - epochs: 2/2, steps: 3500/3536, func: 0.045177, 98%: 0h 1m 55s |
| 05/15/2026 21:00:50 - INFO - root - final eval loss: func: 0.046106 |
| 05/15/2026 21:00:50 - INFO - root - Saving model checkpoint to /mnt/scratch/QRM/experiments/SCoDE/trained/codellama-7b-lora-codellama-7b_std/checkpoint-last |
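
The header figures in this log can be cross-checked with a little arithmetic. The sketch below uses only numbers copied from the log. It assumes that one optimizer step is taken per full group of `grad_acc_steps` batches and that a trailing partial group is dropped each epoch. That rule is an inference from the reported totals, not something the log states. Under that assumption the result matches the logged `Total optimization steps = 3536`, and the trainable share of parameters comes out at roughly 0.6%.

```python
# Values copied from the log header.
num_samples = 28298
batch_size = 1
grad_acc_steps = 16
num_epochs = 2
num_params = 6779101440
num_trainable = 40554752

# Assumption (inferred, not stated in the log): one optimizer step per
# full group of grad_acc_steps batches; a trailing partial group is dropped.
batches_per_epoch = num_samples // batch_size
steps_per_epoch = batches_per_epoch // grad_acc_steps
total_steps = steps_per_epoch * num_epochs

# Share of parameters that receive gradient updates with LoRA enabled.
trainable_fraction = num_trainable / num_params

print(f"steps per epoch: {steps_per_epoch}")
print(f"total optimization steps: {total_steps}")
print(f"trainable fraction: {trainable_fraction:.4%}")
```

A per-epoch count of 1768 steps also agrees with the progress lines, where step 1750 is still reported under epoch 1 and step 1800 under epoch 2.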
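
To compare the running `func` values against the final eval loss, the progress lines can be parsed into records. The following is a minimal sketch: the regular expression is inferred from the line format shown in this log, `func` is assumed to be the name of the training loss term, and the helper names `parse_progress` and `mean_loss_by_epoch` are hypothetical, not part of the training script.

```python
import re
from collections import defaultdict
from statistics import mean

# Pattern inferred from the progress lines in this log, e.g.
# "epochs: 1/2, steps: 50/3536, func: 0.053553, 1%: 3h 12m 37s"
PROGRESS = re.compile(
    r"epochs: (\d+)/\d+, steps: (\d+)/\d+, func: ([0-9.]+), \d+%"
)


def parse_progress(lines):
    """Return (epoch, step, loss) tuples for lines that match PROGRESS."""
    records = []
    for line in lines:
        match = PROGRESS.search(line)
        if match:
            epoch, step, loss = match.groups()
            records.append((int(epoch), int(step), float(loss)))
    return records


def mean_loss_by_epoch(records):
    """Average the logged loss values separately for each epoch."""
    grouped = defaultdict(list)
    for epoch, _step, loss in records:
        grouped[epoch].append(loss)
    return {epoch: mean(losses) for epoch, losses in grouped.items()}


# A few lines copied from the log; the last one is not a progress line
# and is skipped by the parser.
sample = [
    "05/15/2026 17:56:20 - INFO - root - epochs: 1/2, steps: 50/3536, func: 0.053553, 1%: 3h 12m 37s",
    "05/15/2026 19:27:17 - INFO - root - epochs: 1/2, steps: 1750/3536, func: 0.046448, 49%: 1h 35m 41s",
    "05/15/2026 19:29:51 - INFO - root - epochs: 2/2, steps: 1800/3536, func: 0.045678, 50%: 1h 32m 54s",
    "05/15/2026 20:55:31 - INFO - root - epochs: 2/2, steps: 3500/3536, func: 0.045177, 98%: 0h 1m 55s",
    "05/15/2026 21:00:50 - INFO - root - final eval loss: func: 0.046106",
]

records = parse_progress(sample)
per_epoch = mean_loss_by_epoch(records)
print(records)
print(per_epoch)
```

Run over the full log instead of the four sample lines, the same helpers give per-epoch averages that can be set against the final eval loss of 0.046106.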