| 2026-03-15 13:21:26,604 | INFO | SFT data preparation started |
| 2026-03-15 13:21:26,604 | INFO | Log file: data/sft/processed/logs/prepare_sft_data_20260315_132126.log |
| 2026-03-15 13:21:26,605 | INFO | Arguments | config=configs/sft_data_smoltalk.json tokenizer_dir=data/tokenizer output_dir=data/sft/processed seq_len=2048 seed=42 |
| 2026-03-15 13:21:26,605 | INFO | SFT mixture config | num_sources=8 val_examples=2000 max_train_examples=200000 |
| 2026-03-15 13:21:26,605 | INFO | SFT packing config | seq_len=2048 min_supervised_tokens=16 |
| 2026-03-15 13:21:26,605 | INFO | SFT source[0] | name=smol_magpie_ultra path=HuggingFaceTB/smoltalk config_name=smol-magpie-ultra split=train format=messages streaming=False weight=0.4 row_filters={'quality': 'good'} val_target=800 train_target=80000 |
| 2026-03-15 13:21:26,605 | INFO | SFT source[1] | name=openhermes path=HuggingFaceTB/smoltalk config_name=openhermes-100k split=train format=messages streaming=False weight=0.15 row_filters=None val_target=300 train_target=30000 |
| 2026-03-15 13:21:26,605 | INFO | SFT source[2] | name=self_oss_instruct path=HuggingFaceTB/smoltalk config_name=self-oss-instruct split=train format=messages streaming=False weight=0.15 row_filters=None val_target=300 train_target=30000 |
| 2026-03-15 13:21:26,605 | INFO | SFT source[3] | name=everyday_conversations path=HuggingFaceTB/smoltalk config_name=everyday-conversations split=train format=messages streaming=False weight=0.01 row_filters=None val_target=20 train_target=2000 |
| 2026-03-15 13:21:26,605 | INFO | SFT source[4] | name=numina_cot path=HuggingFaceTB/smoltalk config_name=numina-cot-100k split=train format=messages streaming=False weight=0.1 row_filters=None val_target=200 train_target=20000 |
| 2026-03-15 13:21:26,605 | INFO | SFT source[5] | name=metamathqa path=HuggingFaceTB/smoltalk config_name=metamathqa-50k split=train format=messages streaming=False weight=0.05 row_filters=None val_target=100 train_target=10000 |
| 2026-03-15 13:21:26,605 | INFO | SFT source[6] | name=longalign path=HuggingFaceTB/smoltalk config_name=longalign split=train format=messages streaming=False weight=0.015 row_filters=None val_target=30 train_target=3000 |
| 2026-03-15 13:21:26,605 | INFO | SFT source[7] | name=ultrachat_200k path=HuggingFaceH4/ultrachat_200k config_name=None split=train_sft format=messages streaming=False weight=0.125 row_filters=None val_target=250 train_target=25000 |
| 2026-03-15 13:21:26,605 | INFO | Tokenizer special ids | bos=1 eos=2 pad=0 |
| 2026-03-15 13:21:26,606 | INFO | Loading SFT source | name=smol_magpie_ultra |
| 2026-03-15 13:21:49,343 | INFO | SFT progress | processed=5,000 train_examples=4,200 val_examples=800 skipped=2,212 |
| 2026-03-15 13:22:07,970 | INFO | SFT progress | processed=10,000 train_examples=9,200 val_examples=800 skipped=4,536 |
| 2026-03-15 13:22:26,634 | INFO | SFT progress | processed=15,000 train_examples=14,200 val_examples=800 skipped=6,798 |
| 2026-03-15 13:22:44,959 | INFO | SFT progress | processed=20,000 train_examples=19,200 val_examples=800 skipped=9,047 |
| 2026-03-15 13:23:03,316 | INFO | SFT progress | processed=25,000 train_examples=24,200 val_examples=800 skipped=11,398 |
| 2026-03-15 13:23:21,705 | INFO | SFT progress | processed=30,000 train_examples=29,200 val_examples=800 skipped=13,716 |
| 2026-03-15 13:23:39,935 | INFO | SFT progress | processed=35,000 train_examples=34,200 val_examples=800 skipped=15,985 |
| 2026-03-15 13:23:58,367 | INFO | SFT progress | processed=40,000 train_examples=39,200 val_examples=800 skipped=18,284 |
| 2026-03-15 13:24:16,745 | INFO | SFT progress | processed=45,000 train_examples=44,200 val_examples=800 skipped=20,512 |
| 2026-03-15 13:24:35,169 | INFO | SFT progress | processed=50,000 train_examples=49,200 val_examples=800 skipped=22,749 |
| 2026-03-15 13:24:53,377 | INFO | SFT progress | processed=55,000 train_examples=54,200 val_examples=800 skipped=24,949 |
| 2026-03-15 13:25:11,868 | INFO | SFT progress | processed=60,000 train_examples=59,200 val_examples=800 skipped=27,188 |
| 2026-03-15 13:25:30,314 | INFO | SFT progress | processed=65,000 train_examples=64,200 val_examples=800 skipped=29,431 |
| 2026-03-15 13:25:48,714 | INFO | SFT progress | processed=70,000 train_examples=69,200 val_examples=800 skipped=31,716 |
| 2026-03-15 13:26:07,119 | INFO | SFT progress | processed=75,000 train_examples=74,200 val_examples=800 skipped=33,870 |
| 2026-03-15 13:26:25,775 | INFO | SFT progress | processed=80,000 train_examples=79,200 val_examples=800 skipped=36,145 |
| 2026-03-15 13:26:28,721 | INFO | Completed SFT source | name=smol_magpie_ultra train=80,000/80000 val=800/800 seen=117,281 skipped=36,481 |
| 2026-03-15 13:26:28,721 | INFO | Loading SFT source | name=openhermes |
| 2026-03-15 13:26:36,651 | INFO | SFT progress | processed=85,000 train_examples=83,900 val_examples=1,100 skipped=36,707 |
| 2026-03-15 13:26:42,553 | INFO | SFT progress | processed=90,000 train_examples=88,900 val_examples=1,100 skipped=36,961 |
| 2026-03-15 13:26:48,344 | INFO | SFT progress | processed=95,000 train_examples=93,900 val_examples=1,100 skipped=37,227 |
| 2026-03-15 13:26:54,249 | INFO | SFT progress | processed=100,000 train_examples=98,900 val_examples=1,100 skipped=37,516 |
| 2026-03-15 13:27:00,205 | INFO | SFT progress | processed=105,000 train_examples=103,900 val_examples=1,100 skipped=37,782 |
| 2026-03-15 13:27:06,261 | INFO | SFT progress | processed=110,000 train_examples=108,900 val_examples=1,100 skipped=38,065 |
| 2026-03-15 13:27:07,568 | INFO | Completed SFT source | name=openhermes train=30,000/30000 val=300/300 seen=31,945 skipped=1,645 |
| 2026-03-15 13:27:07,568 | INFO | Loading SFT source | name=self_oss_instruct |
| 2026-03-15 13:27:17,619 | INFO | SFT progress | processed=115,000 train_examples=113,600 val_examples=1,400 skipped=38,126 |
| 2026-03-15 13:27:22,498 | INFO | SFT progress | processed=120,000 train_examples=118,600 val_examples=1,400 skipped=38,126 |
| 2026-03-15 13:27:27,485 | INFO | SFT progress | processed=125,000 train_examples=123,600 val_examples=1,400 skipped=38,126 |
| 2026-03-15 13:27:32,482 | INFO | SFT progress | processed=130,000 train_examples=128,600 val_examples=1,400 skipped=38,126 |
| 2026-03-15 13:27:37,473 | INFO | SFT progress | processed=135,000 train_examples=133,600 val_examples=1,400 skipped=38,126 |
| 2026-03-15 13:27:42,522 | INFO | SFT progress | processed=140,000 train_examples=138,600 val_examples=1,400 skipped=38,126 |
| 2026-03-15 13:27:43,916 | INFO | Completed SFT source | name=self_oss_instruct train=30,000/30000 val=300/300 seen=30,300 skipped=0 |
| 2026-03-15 13:27:43,916 | INFO | Loading SFT source | name=everyday_conversations |
| 2026-03-15 13:27:49,524 | INFO | Completed SFT source | name=everyday_conversations train=2,000/2000 val=20/20 seen=2,020 skipped=0 |
| 2026-03-15 13:27:49,525 | INFO | Loading SFT source | name=numina_cot |
| 2026-03-15 13:27:56,930 | INFO | SFT progress | processed=145,000 train_examples=143,380 val_examples=1,620 skipped=38,126 |
| 2026-03-15 13:28:03,530 | INFO | SFT progress | processed=150,000 train_examples=148,380 val_examples=1,620 skipped=38,126 |
| 2026-03-15 13:28:09,916 | INFO | SFT progress | processed=155,000 train_examples=153,380 val_examples=1,620 skipped=38,126 |
| 2026-03-15 13:28:16,444 | INFO | SFT progress | processed=160,000 train_examples=158,380 val_examples=1,620 skipped=38,126 |
| 2026-03-15 13:28:21,164 | INFO | Completed SFT source | name=numina_cot train=20,000/20000 val=200/200 seen=20,200 skipped=0 |
| 2026-03-15 13:28:21,165 | INFO | Loading SFT source | name=metamathqa |
| 2026-03-15 13:28:26,153 | INFO | SFT progress | processed=165,000 train_examples=163,280 val_examples=1,720 skipped=38,126 |
| 2026-03-15 13:28:29,853 | INFO | SFT progress | processed=170,000 train_examples=168,280 val_examples=1,720 skipped=38,127 |
| 2026-03-15 13:28:32,549 | INFO | Completed SFT source | name=metamathqa train=10,000/10000 val=100/100 seen=10,104 skipped=4 |
| 2026-03-15 13:28:32,549 | INFO | Loading SFT source | name=longalign |
| 2026-03-15 13:29:03,538 | INFO | SFT progress | processed=175,000 train_examples=173,250 val_examples=1,750 skipped=38,130 |
| 2026-03-15 13:29:42,829 | INFO | Completed SFT source | name=longalign train=3,000/3000 val=30/30 seen=3,030 skipped=0 |
| 2026-03-15 13:29:42,830 | INFO | Loading SFT source | name=ultrachat_200k |
| 2026-03-15 13:29:56,989 | INFO | SFT progress | processed=180,000 train_examples=178,000 val_examples=2,000 skipped=38,130 |
| 2026-03-15 13:30:12,911 | INFO | SFT progress | processed=185,000 train_examples=183,000 val_examples=2,000 skipped=38,130 |
| 2026-03-15 13:30:28,635 | INFO | SFT progress | processed=190,000 train_examples=188,000 val_examples=2,000 skipped=38,130 |
| 2026-03-15 13:30:44,882 | INFO | SFT progress | processed=195,000 train_examples=193,000 val_examples=2,000 skipped=38,130 |
| 2026-03-15 13:31:01,202 | INFO | SFT progress | processed=200,000 train_examples=198,000 val_examples=2,000 skipped=38,130 |
| 2026-03-15 13:31:07,611 | INFO | Completed SFT source | name=ultrachat_200k train=25,000/25000 val=250/250 seen=25,250 skipped=0 |
| 2026-03-15 13:31:07,614 | INFO | SFT dataset saved | output_dir=data/sft/processed |
| 2026-03-15 13:31:07,615 | INFO | SFT summary | train_examples=200,000 val_examples=2,000 skipped_rows=38,130 |
| 2026-03-15 13:31:07,615 | INFO | SFT metadata saved | path=data/sft/processed/dataset_summary.json |
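
Note on the mixture targets: the per-source `train_target` and `val_target` values logged above are consistent with multiplying each source `weight` by `max_train_examples=200000` and `val_examples=2000` and rounding. The sketch below checks that arithmetic. The rounding rule and the helper name `allocate_targets` are assumptions for illustration; the log does not show how the script actually computes its targets.

```python
# Hypothetical reconstruction: assumes target = round(weight * total).
# The preparation script's real allocation logic is not shown in the log.
MAX_TRAIN_EXAMPLES = 200_000
VAL_EXAMPLES = 2_000

# (name, weight) pairs copied from the "SFT source[i]" log lines.
SOURCES = [
    ("smol_magpie_ultra", 0.4),
    ("openhermes", 0.15),
    ("self_oss_instruct", 0.15),
    ("everyday_conversations", 0.01),
    ("numina_cot", 0.1),
    ("metamathqa", 0.05),
    ("longalign", 0.015),
    ("ultrachat_200k", 0.125),
]

# train_target / val_target values copied from the same log lines.
LOGGED_TRAIN = {
    "smol_magpie_ultra": 80000, "openhermes": 30000,
    "self_oss_instruct": 30000, "everyday_conversations": 2000,
    "numina_cot": 20000, "metamathqa": 10000,
    "longalign": 3000, "ultrachat_200k": 25000,
}
LOGGED_VAL = {
    "smol_magpie_ultra": 800, "openhermes": 300,
    "self_oss_instruct": 300, "everyday_conversations": 20,
    "numina_cot": 200, "metamathqa": 100,
    "longalign": 30, "ultrachat_200k": 250,
}


def allocate_targets(sources, total):
    """Return {name: round(weight * total)} for each (name, weight) pair."""
    return {name: round(weight * total) for name, weight in sources}


train_targets = allocate_targets(SOURCES, MAX_TRAIN_EXAMPLES)
val_targets = allocate_targets(SOURCES, VAL_EXAMPLES)

# Compare the recomputed targets with the logged ones.
print("train targets match log:", train_targets == LOGGED_TRAIN)
print("val targets match log:", val_targets == LOGGED_VAL)
print("totals:", sum(train_targets.values()), sum(val_targets.values()))
```

Under this assumption the recomputed targets sum to the `train_examples` and `val_examples` totals reported in the `SFT summary` line.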