| 2026-04-09 05:58:41,845 [INFO] ============================================================ |
| 2026-04-09 05:58:41,845 [INFO] Model: allenai/OLMo-2-0425-1B (olmo2) |
| 2026-04-09 05:58:41,845 [INFO] Task: math | Dataset: Onlydrinkwater/int_addition2 |
| 2026-04-09 05:58:41,845 [INFO] Epochs: 5 | LR: 5e-05 | Batch: 16x1 |
| 2026-04-09 05:58:41,845 [INFO] Steps: 1565 | Warmup: 156 |
| 2026-04-09 05:58:41,846 [INFO] Max length: 128 | Train samples: 5000 |
| 2026-04-09 05:58:41,846 [INFO] ============================================================ |
| 2026-04-09 05:59:14,693 [INFO] Epoch 1 | train_loss=3.8386 |
| 2026-04-09 05:59:16,071 [INFO] Epoch 1 | test_loss=3.0317 |
| 2026-04-09 05:59:48,021 [INFO] Epoch 2 | train_loss=2.4995 |
| 2026-04-09 05:59:49,382 [INFO] Epoch 2 | test_loss=2.5025 |
| 2026-04-09 06:00:21,191 [INFO] Epoch 3 | train_loss=1.6778 |
| 2026-04-09 06:00:22,563 [INFO] Epoch 3 | test_loss=2.2169 |
| 2026-04-09 06:00:54,469 [INFO] Epoch 4 | train_loss=1.2231 |
| 2026-04-09 06:00:55,827 [INFO] Epoch 4 | test_loss=2.2163 |
| 2026-04-09 06:01:27,626 [INFO] Epoch 5 | train_loss=1.1009 |
| 2026-04-09 06:01:28,982 [INFO] Epoch 5 | test_loss=2.2303 |
| 2026-04-09 06:01:29,157 [INFO] gcc -pthread -B /work1/yizhanh/miniconda3/compiler_compat -fno-strict-overflow -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /work1/yizhanh/miniconda3/include -fPIC -O2 -isystem /work1/yizhanh/miniconda3/include -fPIC -c /work8/yizhanh/tmp/tmp0lyhrkci/test.c -o /work8/yizhanh/tmp/tmp0lyhrkci/test.o |
| 2026-04-09 06:01:29,180 [INFO] gcc -pthread -B /work1/yizhanh/miniconda3/compiler_compat /work8/yizhanh/tmp/tmp0lyhrkci/test.o -laio -o /work8/yizhanh/tmp/tmp0lyhrkci/a.out |
| 2026-04-09 06:01:35,290 [INFO] ============================================================ |
| 2026-04-09 06:01:35,290 [INFO] Training complete in 2.8 min |
| 2026-04-09 06:01:35,290 [INFO] Model saved to OLMo/saved_models/olmo2_addition/final |
| 2026-04-09 06:01:35,291 [INFO] ============================================================ |