| 2026-06-18 03:41:28,071 | T4/P100 (Tesla T4 sm_75): Flash unavailable, using chunked attn |
| 2026-06-18 03:41:28,257 | HTTP Request: GET https://huggingface.co/api/agent-harnesses "HTTP/1.1 200 OK" |
| 2026-06-18 03:41:28,300 | HTTP Request: GET https://huggingface.co/api/whoami-v2 "HTTP/1.1 200 OK" |
| 2026-06-18 03:41:28,351 | HTTP Request: HEAD https://huggingface.co/GODELEV/TOK-8K/resolve/main/config.json "HTTP/1.1 404 Not Found" |
| 2026-06-18 03:41:28,397 | HTTP Request: HEAD https://huggingface.co/GODELEV/TOK-8K/resolve/main/config.json "HTTP/1.1 404 Not Found" |
| 2026-06-18 03:41:28,445 | HTTP Request: HEAD https://huggingface.co/GODELEV/TOK-8K/resolve/main/tokenizer_config.json "HTTP/1.1 307 Temporary Redirect" |
| 2026-06-18 03:41:28,490 | HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/GODELEV/TOK-8K/c2e8c2d4f1641e1f0292f732e5f1875b6b9aa5dc/tokenizer_config.json "HTTP/1.1 200 OK" |
| 2026-06-18 03:41:28,524 | HTTP Request: GET https://huggingface.co/api/resolve-cache/models/GODELEV/TOK-8K/c2e8c2d4f1641e1f0292f732e5f1875b6b9aa5dc/tokenizer_config.json "HTTP/1.1 200 OK" |
| 2026-06-18 03:41:28,777 | HTTP Request: GET https://huggingface.co/api/models/GODELEV/TOK-8K/tree/main/additional_chat_templates?recursive=false&expand=false "HTTP/1.1 404 Not Found" |
| 2026-06-18 03:41:28,827 | HTTP Request: GET https://huggingface.co/api/models/GODELEV/TOK-8K/tree/main?recursive=true&expand=false "HTTP/1.1 200 OK" |
| 2026-06-18 03:41:28,873 | HTTP Request: HEAD https://huggingface.co/GODELEV/TOK-8K/resolve/main/tokenizer.json "HTTP/1.1 307 Temporary Redirect" |
| 2026-06-18 03:41:28,915 | HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/GODELEV/TOK-8K/c2e8c2d4f1641e1f0292f732e5f1875b6b9aa5dc/tokenizer.json "HTTP/1.1 200 OK" |
| 2026-06-18 03:41:28,951 | HTTP Request: GET https://huggingface.co/api/resolve-cache/models/GODELEV/TOK-8K/c2e8c2d4f1641e1f0292f732e5f1875b6b9aa5dc/tokenizer.json "HTTP/1.1 200 OK" |
| 2026-06-18 03:41:29,046 | HTTP Request: HEAD https://huggingface.co/GODELEV/TOK-8K/resolve/main/tokenizer.model "HTTP/1.1 404 Not Found" |
| 2026-06-18 03:41:29,129 | HTTP Request: HEAD https://huggingface.co/GODELEV/TOK-8K/resolve/main/added_tokens.json "HTTP/1.1 404 Not Found" |
| 2026-06-18 03:41:29,173 | HTTP Request: HEAD https://huggingface.co/GODELEV/TOK-8K/resolve/main/special_tokens_map.json "HTTP/1.1 307 Temporary Redirect" |
| 2026-06-18 03:41:29,204 | HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/GODELEV/TOK-8K/c2e8c2d4f1641e1f0292f732e5f1875b6b9aa5dc/special_tokens_map.json "HTTP/1.1 200 OK" |
| 2026-06-18 03:41:29,238 | HTTP Request: GET https://huggingface.co/api/resolve-cache/models/GODELEV/TOK-8K/c2e8c2d4f1641e1f0292f732e5f1875b6b9aa5dc/special_tokens_map.json "HTTP/1.1 200 OK" |
| 2026-06-18 03:41:29,298 | HTTP Request: HEAD https://huggingface.co/GODELEV/TOK-8K/resolve/main/chat_template.jinja "HTTP/1.1 404 Not Found" |
| 2026-06-18 03:41:29,428 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/main/README.md "HTTP/1.1 404 Not Found" |
| 2026-06-18 03:41:29,543 | HTTP Request: GET https://huggingface.co/api/datasets/GODELEV/D1-8-Lite "HTTP/1.1 200 OK" |
| 2026-06-18 03:41:29,608 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/D1-8-Lite.py "HTTP/1.1 404 Not Found" |
| 2026-06-18 03:41:29,693 | HTTP Request: HEAD https://s3.amazonaws.com/datasets.huggingface.co/datasets/datasets/GODELEV/D1-8-Lite/GODELEV/D1-8-Lite.py "HTTP/1.1 404 Not Found" |
| 2026-06-18 03:41:29,735 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/README.md "HTTP/1.1 404 Not Found" |
| 2026-06-18 03:41:29,776 | HTTP Request: GET https://huggingface.co/api/datasets/GODELEV/D1-8-Lite/revision/f12e00e7ed44fcf09dd0bedf10392c8440d74699 "HTTP/1.1 200 OK" |
| 2026-06-18 03:41:29,826 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/.huggingface.yaml "HTTP/1.1 404 Not Found" |
| 2026-06-18 03:41:29,954 | HTTP Request: GET https://datasets-server.huggingface.co/info?dataset=GODELEV/D1-8-Lite "HTTP/1.1 200 OK" |
| 2026-06-18 03:41:30,103 | HTTP Request: GET https://huggingface.co/api/datasets/GODELEV/D1-8-Lite/tree/f12e00e7ed44fcf09dd0bedf10392c8440d74699/data?recursive=true&expand=false "HTTP/1.1 200 OK" |
| 2026-06-18 03:41:30,161 | HTTP Request: GET https://huggingface.co/api/datasets/GODELEV/D1-8-Lite/tree/f12e00e7ed44fcf09dd0bedf10392c8440d74699?recursive=false&expand=false "HTTP/1.1 200 OK" |
| 2026-06-18 03:41:30,210 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/dataset_infos.json "HTTP/1.1 404 Not Found" |
| 2026-06-18 03:41:30,518 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/data/train-00000-of-00015.parquet "HTTP/1.1 302 Found" |
| 2026-06-18 03:41:35,732 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/data/train-00001-of-00015.parquet "HTTP/1.1 302 Found" |
| 2026-06-18 03:41:39,903 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/data/train-00002-of-00015.parquet "HTTP/1.1 302 Found" |
| 2026-06-18 03:41:46,943 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/data/train-00003-of-00015.parquet "HTTP/1.1 302 Found" |
| 2026-06-18 03:41:51,521 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/data/train-00004-of-00015.parquet "HTTP/1.1 302 Found" |
| 2026-06-18 03:41:55,684 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/data/train-00005-of-00015.parquet "HTTP/1.1 302 Found" |
| 2026-06-18 03:42:00,661 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/data/train-00006-of-00015.parquet "HTTP/1.1 302 Found" |
| 2026-06-18 03:42:04,575 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/data/train-00007-of-00015.parquet "HTTP/1.1 302 Found" |
| 2026-06-18 03:42:12,324 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/data/train-00008-of-00015.parquet "HTTP/1.1 302 Found" |
| 2026-06-18 03:42:17,459 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/data/train-00009-of-00015.parquet "HTTP/1.1 302 Found" |
| 2026-06-18 03:42:21,182 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/data/train-00010-of-00015.parquet "HTTP/1.1 302 Found" |
| 2026-06-18 03:42:26,655 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/data/train-00011-of-00015.parquet "HTTP/1.1 302 Found" |
| 2026-06-18 03:42:30,817 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/data/train-00012-of-00015.parquet "HTTP/1.1 302 Found" |
| 2026-06-18 03:42:36,392 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/data/train-00013-of-00015.parquet "HTTP/1.1 302 Found" |
| 2026-06-18 03:42:40,587 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/data/train-00014-of-00015.parquet "HTTP/1.1 302 Found" |
| 2026-06-18 03:42:44,218 | HTTP Request: HEAD https://huggingface.co/datasets/GODELEV/D1-8-Lite/resolve/f12e00e7ed44fcf09dd0bedf10392c8440d74699/data/val-00000-of-00001.parquet "HTTP/1.1 302 Found" |
| 2026-06-18 03:44:55,892 | Data loaded train=5,901,696 val=20,000 steps_per_epoch=5,764 total_epochs=1.00 |
| 2026-06-18 03:44:58,643 | Model 9.853M params | float32 | amp=torch.float16 | compiled=True | attn=chunked |
| 2026-06-18 03:44:58,807 | HTTP Request: HEAD https://huggingface.co/GODELEV/Exp-1/resolve/main/resume/latest_step.txt "HTTP/1.1 307 Temporary Redirect" |
| 2026-06-18 03:44:58,854 | HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/GODELEV/Exp-1/2ffdf994730045f6596d2a447229a7f782675f09/resume%2Flatest_step.txt "HTTP/1.1 200 OK" |
| 2026-06-18 03:44:58,905 | HTTP Request: GET https://huggingface.co/api/resolve-cache/models/GODELEV/Exp-1/2ffdf994730045f6596d2a447229a7f782675f09/resume%2Flatest_step.txt "HTTP/1.1 200 OK" |
| 2026-06-18 03:44:58,964 | HTTP Request: HEAD https://huggingface.co/GODELEV/Exp-1/resolve/main/resume/ckpt.pt "HTTP/1.1 302 Found" |
| 2026-06-18 03:45:01,840 | Resumed step=4000 tokens=4190208000 samples=4096000 |
| 2026-06-18 03:51:59,310 | step= 4020 | epoch=0.70 | loss=2.1485 | ppl=8.57 | lr=9.22e-05 | grad=0.781 | tok/s=58,599 | ETA=8:39:36 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 03:57:55,497 | step= 4040 | epoch=0.70 | loss=2.1797 | ppl=8.84 | lr=9.09e-05 | grad=0.719 | tok/s=59,028 | ETA=8:29:55 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 04:03:50,733 | step= 4060 | epoch=0.70 | loss=2.1667 | ppl=8.73 | lr=8.96e-05 | grad=0.742 | tok/s=59,006 | ETA=8:24:11 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 04:09:46,109 | step= 4080 | epoch=0.71 | loss=2.1800 | ppl=8.85 | lr=8.84e-05 | grad=0.697 | tok/s=58,909 | ETA=8:19:05 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 04:15:41,446 | step= 4100 | epoch=0.71 | loss=2.1828 | ppl=8.87 | lr=8.71e-05 | grad=0.744 | tok/s=59,003 | ETA=8:12:22 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 04:21:37,261 | step= 4120 | epoch=0.71 | loss=2.1590 | ppl=8.66 | lr=8.58e-05 | grad=0.678 | tok/s=59,082 | ETA=8:05:48 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 04:27:32,467 | step= 4140 | epoch=0.72 | loss=2.1848 | ppl=8.89 | lr=8.46e-05 | grad=0.804 | tok/s=59,064 | ETA=8:00:03 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 04:33:27,871 | step= 4160 | epoch=0.72 | loss=2.2006 | ppl=9.03 | lr=8.33e-05 | grad=0.741 | tok/s=58,955 | ETA=7:55:00 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 04:39:23,444 | step= 4180 | epoch=0.73 | loss=2.1747 | ppl=8.80 | lr=8.21e-05 | grad=0.745 | tok/s=58,924 | ETA=7:49:20 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 04:45:19,095 | step= 4200 | epoch=0.73 | loss=2.1793 | ppl=8.84 | lr=8.09e-05 | grad=0.663 | tok/s=58,808 | ETA=7:44:19 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 04:51:14,977 | step= 4220 | epoch=0.73 | loss=2.1672 | ppl=8.73 | lr=7.97e-05 | grad=0.773 | tok/s=58,358 | ETA=7:41:55 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 04:57:10,628 | step= 4240 | epoch=0.74 | loss=2.1844 | ppl=8.88 | lr=7.85e-05 | grad=0.659 | tok/s=58,951 | ETA=7:31:21 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 05:03:06,176 | step= 4260 | epoch=0.74 | loss=2.1763 | ppl=8.81 | lr=7.73e-05 | grad=0.607 | tok/s=58,936 | ETA=7:25:32 | VRAM=0.7GB | RAM=20% |
| 2026-06-18 05:09:01,885 | step= 4280 | epoch=0.74 | loss=2.1723 | ppl=8.78 | lr=7.61e-05 | grad=0.659 | tok/s=58,879 | ETA=7:20:02 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 05:14:57,516 | step= 4300 | epoch=0.75 | loss=2.1619 | ppl=8.69 | lr=7.50e-05 | grad=0.802 | tok/s=58,946 | ETA=7:13:37 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 05:20:53,080 | step= 4320 | epoch=0.75 | loss=2.1522 | ppl=8.60 | lr=7.38e-05 | grad=0.739 | tok/s=58,948 | ETA=7:07:41 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 05:26:48,782 | step= 4340 | epoch=0.75 | loss=2.1752 | ppl=8.80 | lr=7.27e-05 | grad=0.739 | tok/s=58,890 | ETA=7:02:10 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 05:32:44,272 | step= 4360 | epoch=0.76 | loss=2.1713 | ppl=8.77 | lr=7.16e-05 | grad=0.670 | tok/s=58,941 | ETA=6:55:52 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 05:38:39,789 | step= 4380 | epoch=0.76 | loss=2.1523 | ppl=8.60 | lr=7.04e-05 | grad=0.725 | tok/s=58,925 | ETA=6:50:04 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 05:44:35,320 | step= 4400 | epoch=0.76 | loss=2.1495 | ppl=8.58 | lr=6.93e-05 | grad=0.697 | tok/s=58,902 | ETA=6:44:18 | VRAM=0.7GB | RAM=19% |
| 2026-06-18 05:47:07,315 | VAL step=4400 epoch=0.76 loss=2.4235 ppl=11.29 ★ BEST |
| 2026-06-18 05:47:07,484 | Checkpoint saved: step 4400 |
| 2026-06-18 05:47:07,574 | Saved safetensors: 101 tensors |
| 2026-06-18 05:47:07,892 | HTTP Request: POST https://huggingface.co/api/repos/create "HTTP/1.1 409 Conflict" |
| 2026-06-18 05:47:07,931 | HTTP Request: POST https://huggingface.co/api/validate-yaml "HTTP/1.1 200 OK" |
| 2026-06-18 05:47:08,389 | HTTP Request: POST https://huggingface.co/api/validate-yaml "HTTP/1.1 200 OK" |
| 2026-06-18 05:47:08,450 | HTTP Request: POST https://huggingface.co/api/models/GODELEV/Exp-1/preupload/main "HTTP/1.1 200 OK" |
| 2026-06-18 05:47:08,515 | HTTP Request: POST https://huggingface.co/GODELEV/Exp-1.git/info/lfs/objects/batch "HTTP/1.1 200 OK" |
| 2026-06-18 05:47:12,783 | HTTP Request: POST https://huggingface.co/api/models/GODELEV/Exp-1/commit/main "HTTP/1.1 200 OK" |
| 2026-06-18 05:47:12,786 | Hub push step=4400 |
| 2026-06-18 05:53:09,163 | step= 4420 | epoch=0.77 | loss=2.1645 | ppl=8.71 | lr=6.83e-05 | grad=0.684 | tok/s=58,881 | ETA=6:38:31 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 05:59:04,603 | step= 4440 | epoch=0.77 | loss=2.1749 | ppl=8.80 | lr=6.72e-05 | grad=0.728 | tok/s=58,949 | ETA=6:32:07 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 06:05:00,566 | step= 4460 | epoch=0.77 | loss=2.1832 | ppl=8.87 | lr=6.61e-05 | grad=0.650 | tok/s=58,907 | ETA=6:26:29 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 06:10:56,249 | step= 4480 | epoch=0.78 | loss=2.1404 | ppl=8.50 | lr=6.51e-05 | grad=0.639 | tok/s=58,854 | ETA=6:20:54 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 06:16:51,955 | step= 4500 | epoch=0.78 | loss=2.1508 | ppl=8.59 | lr=6.40e-05 | grad=0.728 | tok/s=58,902 | ETA=6:14:39 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 06:22:47,781 | step= 4520 | epoch=0.78 | loss=2.1698 | ppl=8.76 | lr=6.30e-05 | grad=0.635 | tok/s=58,855 | ETA=6:09:01 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 06:28:43,656 | step= 4540 | epoch=0.79 | loss=2.1576 | ppl=8.65 | lr=6.20e-05 | grad=0.733 | tok/s=58,877 | ETA=6:02:57 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 06:34:39,423 | step= 4560 | epoch=0.79 | loss=2.1463 | ppl=8.55 | lr=6.10e-05 | grad=0.678 | tok/s=58,941 | ETA=5:56:38 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 06:40:35,087 | step= 4580 | epoch=0.79 | loss=2.1636 | ppl=8.70 | lr=6.00e-05 | grad=0.633 | tok/s=58,934 | ETA=5:50:45 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 06:46:31,028 | step= 4600 | epoch=0.80 | loss=2.1501 | ppl=8.59 | lr=5.91e-05 | grad=0.621 | tok/s=58,889 | ETA=5:45:05 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 06:52:26,733 | step= 4620 | epoch=0.80 | loss=2.1704 | ppl=8.76 | lr=5.81e-05 | grad=0.653 | tok/s=58,782 | ETA=5:39:47 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 06:58:23,124 | step= 4640 | epoch=0.80 | loss=2.1470 | ppl=8.56 | lr=5.72e-05 | grad=0.695 | tok/s=58,927 | ETA=5:33:01 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 07:04:18,918 | step= 4660 | epoch=0.81 | loss=2.1557 | ppl=8.63 | lr=5.62e-05 | grad=0.671 | tok/s=58,969 | ETA=5:26:51 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 07:10:14,784 | step= 4680 | epoch=0.81 | loss=2.1546 | ppl=8.62 | lr=5.53e-05 | grad=0.687 | tok/s=58,887 | ETA=5:21:23 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 07:16:10,373 | step= 4700 | epoch=0.82 | loss=2.1704 | ppl=8.76 | lr=5.44e-05 | grad=0.611 | tok/s=58,956 | ETA=5:15:05 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 07:22:06,326 | step= 4720 | epoch=0.82 | loss=2.1731 | ppl=8.79 | lr=5.36e-05 | grad=0.657 | tok/s=58,874 | ETA=5:09:36 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 07:28:02,128 | step= 4740 | epoch=0.82 | loss=2.1466 | ppl=8.56 | lr=5.27e-05 | grad=0.669 | tok/s=58,874 | ETA=5:03:40 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 07:33:57,926 | step= 4760 | epoch=0.83 | loss=2.1569 | ppl=8.64 | lr=5.18e-05 | grad=0.657 | tok/s=58,915 | ETA=4:57:31 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 07:39:53,810 | step= 4780 | epoch=0.83 | loss=2.1465 | ppl=8.56 | lr=5.10e-05 | grad=0.668 | tok/s=58,825 | ETA=4:52:03 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 07:45:49,742 | step= 4800 | epoch=0.83 | loss=2.1479 | ppl=8.57 | lr=5.02e-05 | grad=0.650 | tok/s=58,801 | ETA=4:46:13 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 07:47:49,396 | VAL step=4800 epoch=0.83 loss=2.4145 ppl=11.18 ★ BEST |
| 2026-06-18 07:47:49,582 | Checkpoint saved: step 4800 |
| 2026-06-18 07:47:49,660 | Saved safetensors: 101 tensors |
| 2026-06-18 07:47:50,051 | HTTP Request: POST https://huggingface.co/api/repos/create "HTTP/1.1 409 Conflict" |
| 2026-06-18 07:47:50,089 | HTTP Request: POST https://huggingface.co/api/validate-yaml "HTTP/1.1 200 OK" |
| 2026-06-18 07:47:50,546 | HTTP Request: POST https://huggingface.co/api/validate-yaml "HTTP/1.1 200 OK" |
| 2026-06-18 07:47:50,638 | HTTP Request: POST https://huggingface.co/api/models/GODELEV/Exp-1/preupload/main "HTTP/1.1 200 OK" |
| 2026-06-18 07:47:50,721 | HTTP Request: POST https://huggingface.co/GODELEV/Exp-1.git/info/lfs/objects/batch "HTTP/1.1 200 OK" |
| 2026-06-18 07:47:57,173 | HTTP Request: POST https://huggingface.co/api/models/GODELEV/Exp-1/commit/main "HTTP/1.1 200 OK" |
| 2026-06-18 07:47:57,176 | Hub push step=4800 |
| 2026-06-18 07:53:53,592 | step= 4820 | epoch=0.84 | loss=2.1503 | ppl=8.59 | lr=4.94e-05 | grad=0.635 | tok/s=58,943 | ETA=4:39:37 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 07:59:48,665 | step= 4840 | epoch=0.84 | loss=2.1422 | ppl=8.52 | lr=4.86e-05 | grad=0.676 | tok/s=58,983 | ETA=4:33:30 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 08:05:45,366 | step= 4860 | epoch=0.84 | loss=2.1280 | ppl=8.40 | lr=4.78e-05 | grad=0.635 | tok/s=58,931 | ETA=4:27:49 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 08:11:41,029 | step= 4880 | epoch=0.85 | loss=2.1451 | ppl=8.54 | lr=4.70e-05 | grad=0.711 | tok/s=58,855 | ETA=4:22:14 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 08:17:36,485 | step= 4900 | epoch=0.85 | loss=2.1485 | ppl=8.57 | lr=4.63e-05 | grad=0.837 | tok/s=58,987 | ETA=4:15:43 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 08:23:31,704 | step= 4920 | epoch=0.85 | loss=2.1604 | ppl=8.67 | lr=4.56e-05 | grad=0.608 | tok/s=59,030 | ETA=4:09:37 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 08:29:27,000 | step= 4940 | epoch=0.86 | loss=2.1438 | ppl=8.53 | lr=4.49e-05 | grad=0.635 | tok/s=58,971 | ETA=4:03:57 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 08:35:22,321 | step= 4960 | epoch=0.86 | loss=2.1529 | ppl=8.61 | lr=4.42e-05 | grad=0.600 | tok/s=58,960 | ETA=3:58:04 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 08:41:18,291 | step= 4980 | epoch=0.86 | loss=2.1578 | ppl=8.65 | lr=4.35e-05 | grad=0.591 | tok/s=58,901 | ETA=3:52:23 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 08:47:14,140 | step= 5000 | epoch=0.87 | loss=2.1299 | ppl=8.41 | lr=4.28e-05 | grad=0.599 | tok/s=58,769 | ETA=3:46:58 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 08:53:10,622 | step= 5020 | epoch=0.87 | loss=2.1318 | ppl=8.43 | lr=4.22e-05 | grad=0.590 | tok/s=58,713 | ETA=3:41:14 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 08:59:07,332 | step= 5040 | epoch=0.87 | loss=2.1591 | ppl=8.66 | lr=4.15e-05 | grad=0.558 | tok/s=58,696 | ETA=3:35:21 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 09:05:03,847 | step= 5060 | epoch=0.88 | loss=2.1532 | ppl=8.61 | lr=4.09e-05 | grad=0.668 | tok/s=58,744 | ETA=3:29:13 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 09:11:00,104 | step= 5080 | epoch=0.88 | loss=2.1415 | ppl=8.51 | lr=4.03e-05 | grad=0.673 | tok/s=58,840 | ETA=3:22:57 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 09:16:55,699 | step= 5100 | epoch=0.88 | loss=2.1600 | ppl=8.67 | lr=3.97e-05 | grad=0.608 | tok/s=58,888 | ETA=3:16:51 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 09:22:52,911 | step= 5120 | epoch=0.89 | loss=2.1678 | ppl=8.74 | lr=3.91e-05 | grad=0.692 | tok/s=58,505 | ETA=3:12:11 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 09:28:48,912 | step= 5140 | epoch=0.89 | loss=2.1460 | ppl=8.55 | lr=3.86e-05 | grad=0.537 | tok/s=58,833 | ETA=3:05:10 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 09:34:44,722 | step= 5160 | epoch=0.90 | loss=2.1464 | ppl=8.55 | lr=3.81e-05 | grad=0.600 | tok/s=59,149 | ETA=2:58:17 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 09:40:41,012 | step= 5180 | epoch=0.90 | loss=2.1464 | ppl=8.55 | lr=3.75e-05 | grad=0.605 | tok/s=58,878 | ETA=2:53:10 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 09:46:36,702 | step= 5200 | epoch=0.90 | loss=2.1408 | ppl=8.51 | lr=3.70e-05 | grad=0.661 | tok/s=58,868 | ETA=2:47:16 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 09:48:36,310 | VAL step=5200 epoch=0.90 loss=2.4072 ppl=11.10 ★ BEST |
| 2026-06-18 09:48:36,491 | Checkpoint saved: step 5200 |
| 2026-06-18 09:48:36,569 | Saved safetensors: 101 tensors |
| 2026-06-18 09:48:36,940 | HTTP Request: POST https://huggingface.co/api/repos/create "HTTP/1.1 409 Conflict" |
| 2026-06-18 09:48:36,977 | HTTP Request: POST https://huggingface.co/api/validate-yaml "HTTP/1.1 200 OK" |
| 2026-06-18 09:48:37,439 | HTTP Request: POST https://huggingface.co/api/validate-yaml "HTTP/1.1 200 OK" |
| 2026-06-18 09:48:37,609 | HTTP Request: POST https://huggingface.co/api/models/GODELEV/Exp-1/preupload/main "HTTP/1.1 200 OK" |
| 2026-06-18 09:48:37,742 | HTTP Request: POST https://huggingface.co/GODELEV/Exp-1.git/info/lfs/objects/batch "HTTP/1.1 200 OK" |
| 2026-06-18 09:48:41,770 | HTTP Request: POST https://huggingface.co/api/models/GODELEV/Exp-1/commit/main "HTTP/1.1 200 OK" |
| 2026-06-18 09:48:41,772 | Hub push step=5200 |
| 2026-06-18 09:54:38,230 | step= 5220 | epoch=0.91 | loss=2.1563 | ppl=8.64 | lr=3.66e-05 | grad=0.680 | tok/s=58,916 | ETA=2:41:12 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 10:00:34,307 | step= 5240 | epoch=0.91 | loss=2.1407 | ppl=8.51 | lr=3.61e-05 | grad=0.605 | tok/s=57,784 | ETA=2:38:19 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 10:06:30,823 | step= 5260 | epoch=0.91 | loss=2.1221 | ppl=8.35 | lr=3.56e-05 | grad=0.559 | tok/s=58,884 | ETA=2:29:26 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 10:12:26,318 | step= 5280 | epoch=0.92 | loss=2.1375 | ppl=8.48 | lr=3.52e-05 | grad=0.588 | tok/s=58,963 | ETA=2:23:18 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 10:18:21,711 | step= 5300 | epoch=0.92 | loss=2.1381 | ppl=8.48 | lr=3.48e-05 | grad=0.700 | tok/s=58,931 | ETA=2:17:28 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 10:24:17,055 | step= 5320 | epoch=0.92 | loss=2.1475 | ppl=8.56 | lr=3.44e-05 | grad=0.669 | tok/s=58,948 | ETA=2:11:30 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 10:30:12,431 | step= 5340 | epoch=0.93 | loss=2.1377 | ppl=8.48 | lr=3.40e-05 | grad=0.586 | tok/s=58,989 | ETA=2:05:29 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 10:36:07,670 | step= 5360 | epoch=0.93 | loss=2.1639 | ppl=8.70 | lr=3.36e-05 | grad=0.590 | tok/s=59,128 | ETA=1:59:17 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 10:42:03,441 | step= 5380 | epoch=0.93 | loss=2.1365 | ppl=8.47 | lr=3.33e-05 | grad=0.587 | tok/s=58,470 | ETA=1:54:39 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 10:47:58,839 | step= 5400 | epoch=0.94 | loss=2.1412 | ppl=8.51 | lr=3.30e-05 | grad=0.632 | tok/s=59,087 | ETA=1:47:33 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 10:53:53,295 | step= 5420 | epoch=0.94 | loss=2.1404 | ppl=8.50 | lr=3.26e-05 | grad=0.587 | tok/s=59,015 | ETA=1:41:46 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 10:59:47,797 | step= 5440 | epoch=0.94 | loss=2.1475 | ppl=8.56 | lr=3.23e-05 | grad=0.624 | tok/s=59,043 | ETA=1:35:48 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 11:05:42,383 | step= 5460 | epoch=0.95 | loss=2.1354 | ppl=8.46 | lr=3.21e-05 | grad=0.589 | tok/s=58,978 | ETA=1:29:59 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 11:11:37,498 | step= 5480 | epoch=0.95 | loss=2.1549 | ppl=8.63 | lr=3.18e-05 | grad=0.658 | tok/s=58,888 | ETA=1:24:12 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 11:17:33,117 | step= 5500 | epoch=0.95 | loss=2.1246 | ppl=8.37 | lr=3.16e-05 | grad=0.540 | tok/s=58,967 | ETA=1:18:09 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 11:23:28,765 | step= 5520 | epoch=0.96 | loss=2.1320 | ppl=8.43 | lr=3.13e-05 | grad=0.583 | tok/s=58,943 | ETA=1:12:16 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 11:29:24,112 | step= 5540 | epoch=0.96 | loss=2.1344 | ppl=8.45 | lr=3.11e-05 | grad=0.557 | tok/s=58,939 | ETA=1:06:21 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 11:35:19,661 | step= 5560 | epoch=0.96 | loss=2.1237 | ppl=8.36 | lr=3.09e-05 | grad=0.690 | tok/s=58,882 | ETA=1:00:29 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 11:41:15,838 | step= 5580 | epoch=0.97 | loss=2.1571 | ppl=8.65 | lr=3.08e-05 | grad=0.596 | tok/s=58,733 | ETA=0:54:41 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 11:47:12,564 | step= 5600 | epoch=0.97 | loss=2.1428 | ppl=8.52 | lr=3.06e-05 | grad=0.633 | tok/s=58,129 | ETA=0:49:15 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 11:49:10,996 | VAL step=5600 epoch=0.97 loss=2.4024 ppl=11.05 ★ BEST |
| 2026-06-18 11:49:11,180 | Checkpoint saved: step 5600 |
| 2026-06-18 11:49:11,286 | Saved safetensors: 101 tensors |
| 2026-06-18 11:49:11,676 | HTTP Request: POST https://huggingface.co/api/repos/create "HTTP/1.1 409 Conflict" |
| 2026-06-18 11:49:11,716 | HTTP Request: POST https://huggingface.co/api/validate-yaml "HTTP/1.1 200 OK" |
| 2026-06-18 11:49:12,178 | HTTP Request: POST https://huggingface.co/api/validate-yaml "HTTP/1.1 200 OK" |
| 2026-06-18 11:49:12,335 | HTTP Request: POST https://huggingface.co/api/models/GODELEV/Exp-1/preupload/main "HTTP/1.1 200 OK" |
| 2026-06-18 11:49:12,396 | HTTP Request: POST https://huggingface.co/GODELEV/Exp-1.git/info/lfs/objects/batch "HTTP/1.1 200 OK" |
| 2026-06-18 11:49:17,215 | HTTP Request: POST https://huggingface.co/api/models/GODELEV/Exp-1/commit/main "HTTP/1.1 200 OK" |
| 2026-06-18 11:49:17,217 | Hub push step=5600 |
| 2026-06-18 11:55:12,213 | step= 5620 | epoch=0.98 | loss=2.1410 | ppl=8.51 | lr=3.05e-05 | grad=0.698 | tok/s=59,725 | ETA=0:42:05 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 12:01:07,577 | step= 5640 | epoch=0.98 | loss=2.1294 | ppl=8.41 | lr=3.03e-05 | grad=0.588 | tok/s=58,918 | ETA=0:36:44 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 12:07:03,160 | step= 5660 | epoch=0.98 | loss=2.1442 | ppl=8.53 | lr=3.02e-05 | grad=0.555 | tok/s=58,866 | ETA=0:30:50 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 12:12:59,818 | step= 5680 | epoch=0.99 | loss=2.1615 | ppl=8.68 | lr=3.02e-05 | grad=0.675 | tok/s=58,721 | ETA=0:24:58 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 12:18:56,748 | step= 5700 | epoch=0.99 | loss=2.1203 | ppl=8.33 | lr=3.01e-05 | grad=0.549 | tok/s=58,725 | ETA=0:19:01 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 12:24:52,881 | step= 5720 | epoch=0.99 | loss=2.1371 | ppl=8.48 | lr=3.00e-05 | grad=0.619 | tok/s=59,179 | ETA=0:12:58 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 12:30:49,389 | step= 5740 | epoch=1.00 | loss=2.1280 | ppl=8.40 | lr=3.00e-05 | grad=0.688 | tok/s=58,874 | ETA=0:07:07 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 12:36:45,707 | step= 5760 | epoch=1.00 | loss=2.1222 | ppl=8.35 | lr=3.00e-05 | grad=0.624 | tok/s=58,689 | ETA=0:01:11 | VRAM=0.7GB | RAM=22% |
| 2026-06-18 12:39:56,040 | Final eval loss=2.4005 ppl=11.03 |
| 2026-06-18 12:39:56,272 | Checkpoint saved: step 5764 |
| 2026-06-18 12:39:56,377 | Saved safetensors: 101 tensors |
| 2026-06-18 12:39:56,792 | HTTP Request: POST https://huggingface.co/api/repos/create "HTTP/1.1 409 Conflict" |
| 2026-06-18 12:39:56,829 | HTTP Request: POST https://huggingface.co/api/validate-yaml "HTTP/1.1 200 OK" |
| 2026-06-18 12:39:57,289 | HTTP Request: POST https://huggingface.co/api/validate-yaml "HTTP/1.1 200 OK" |
| 2026-06-18 12:39:57,352 | HTTP Request: POST https://huggingface.co/api/models/GODELEV/Exp-1/preupload/main "HTTP/1.1 200 OK" |
| 2026-06-18 12:39:57,417 | HTTP Request: POST https://huggingface.co/GODELEV/Exp-1.git/info/lfs/objects/batch "HTTP/1.1 200 OK" |
| 2026-06-18 12:40:01,401 | HTTP Request: POST https://huggingface.co/api/models/GODELEV/Exp-1/commit/main "HTTP/1.1 200 OK" |
| 2026-06-18 12:40:01,404 | Hub push step=5764 |
| 2026-06-18 12:40:01,489 | HTTP Request: POST https://huggingface.co/api/models/GODELEV/Exp-1/preupload/main "HTTP/1.1 200 OK" |
| 2026-06-18 12:40:02,002 | HTTP Request: POST https://huggingface.co/api/models/GODELEV/Exp-1/commit/main "HTTP/1.1 200 OK" |
| 2026-06-18 12:40:02,076 | HTTP Request: POST https://huggingface.co/api/models/GODELEV/Exp-1/preupload/main "HTTP/1.1 200 OK" |
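
The logged figures can be cross-checked against each other. The sketch below is not part of the training script; it assumes that `ppl` is `exp(loss)`, that the per-step batch is constant, and that `steps_per_epoch` is the sample count divided by samples per step, rounded up. The helper name `check_ppl` and the tolerance are hypothetical. The sample rows and the numeric constants are copied from the log above.

```python
import math
import re

# Matches both "loss=2.1485 | ppl=8.57" (step rows) and "loss=2.4235 ppl=11.29" (VAL rows).
PAIR = re.compile(r"loss=(\d+\.\d+)[\s|]+ppl=(\d+\.\d+)")


def check_ppl(lines, tol=0.01):
    """Return (loss, logged_ppl, recomputed_ppl) for rows where exp(loss) and ppl disagree."""
    mismatches = []
    for line in lines:
        m = PAIR.search(line)
        if not m:
            continue  # not a metrics row (e.g. an HTTP request line)
        loss, ppl = float(m.group(1)), float(m.group(2))
        recomputed = math.exp(loss)
        if abs(recomputed - ppl) > tol:
            mismatches.append((loss, ppl, recomputed))
    return mismatches


# Three rows copied from the log (the first one truncated after the lr field).
sample = [
    "| 2026-06-18 03:51:59,310 | step= 4020 | epoch=0.70 | loss=2.1485 | ppl=8.57 | lr=9.22e-05 |",
    "| 2026-06-18 05:47:07,315 | VAL step=4400 epoch=0.76 loss=2.4235 ppl=11.29 ★ BEST |",
    "| 2026-06-18 12:39:56,040 | Final eval loss=2.4005 ppl=11.03 |",
]
mismatches = check_ppl(sample)

# From "Resumed step=4000 tokens=4190208000 samples=4096000".
tokens_per_step = 4190208000 // 4000
samples_per_step = 4096000 // 4000
tokens_per_sample = tokens_per_step // samples_per_step

# From "Data loaded train=5,901,696 ... steps_per_epoch=5,764" (ceil is an assumption).
steps_per_epoch = math.ceil(5901696 / samples_per_step)

print(mismatches, tokens_per_step, samples_per_step, tokens_per_sample, steps_per_epoch)
```

Under these assumptions the resume counters imply 1,047,552 tokens and 1,024 samples per optimizer step, and the derived step count agrees with the logged `steps_per_epoch`.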