[2025-11-26 03:25:26,747] [INFO] [axolotl.utils.data.sft._load_raw_datasets:320] [PID:64100] Loading raw datasets...
[2025-11-26 03:25:29,268] [INFO] [axolotl.utils.data.wrappers.get_dataset_wrapper:87] [PID:64100] Loading dataset: ToastyPigeon/limarp-augmented-train-last-only with base_type: chat_template and prompt_style: None
[2025-11-26 03:25:29,972] [WARNING] [huggingface_hub.repocard.content:108] [PID:64100] Repo card metadata block was not found. Setting CardData to empty.
[2025-11-26 03:25:30,754] [INFO] [axolotl.utils.data.wrappers.get_dataset_wrapper:87] [PID:64100] Loading dataset: ToastyPigeon/mixed-medical-reasoning-formatted with base_type: chat_template and prompt_style: None
[2025-11-26 03:25:32,584] [WARNING] [huggingface_hub.repocard.content:108] [PID:64100] Repo card metadata block was not found. Setting CardData to empty.
[2025-11-26 03:25:33,613] [INFO] [axolotl.utils.data.wrappers.get_dataset_wrapper:87] [PID:64100] Loading dataset: ToastyPigeon/kimi-stories-instruct with base_type: chat_template and prompt_style: None
[2025-11-26 03:25:35,627] [INFO] [axolotl.utils.data.wrappers.get_dataset_wrapper:87] [PID:64100] Loading dataset: allura-forge/koto-instruct-sft-nothink with base_type: chat_template and prompt_style: None
[2025-11-26 03:25:36,310] [WARNING] [huggingface_hub.repocard.content:108] [PID:64100] Repo card metadata block was not found. Setting CardData to empty.
[2025-11-26 03:25:37,341] [INFO] [axolotl.utils.data.wrappers.get_dataset_wrapper:87] [PID:64100] Loading dataset: ToastyPigeon/SpringDragon-Instruct with base_type: chat_template and prompt_style: None
Tokenizing Prompts (num_proc=24): 0%| | 0/2535 [00:00
Dropping Long Sequences (>4096) (num_proc=24): 0%| | 0/132318 [00:00
Dropping Long Sequences (>4096) (num_proc=24): 1%|▏ | 1000/132318 [00:01<02:22, 922.06 examples/s]
Dropping Long Sequences (>4096) (num_proc=24): 8%|█▎ | 10000/132318 [00:01<00:11, 11073.18 examples/s]
Dropping Long Sequences (>4096) (num_proc=24): 16%|██▋ | 21000/132318 [00:01<00:04, 24182.19 examples/s]
Dropping Long Sequences (>4096) (num_proc=24): 21%|███▌ | 28000/132318 [00:01<00:04, 23802.12 examples/s]
Dropping Long Sequences (>4096) (num_proc=24): 30%|█████▏ | 40000/132318 [00:01<00:02, 37748.88 examples/s]
Dropping Long Sequences (>4096) (num_proc=24): 36%|██████▏ | 48000/132318 [00:01<00:02, 41263.96 examples/s]
Dropping Long Sequences (>4096) (num_proc=24): 42%|███████ | 55000/132318 [00:02<00:02, 34709.41 examples/s]
Dropping Long Sequences (>4096) (num_proc=24): 48%|████████ | 63000/132318 [00:02<00:01, 41579.13 examples/s]
Dropping Long Sequences (>4096) (num_proc=24): 53%|████████▉ | 70000/132318 [00:02<00:01, 46552.60 examples/s]
Dropping Long Sequences (>4096) (num_proc=24): 58%|█████████▉ | 77000/132318 [00:02<00:01, 37040.55 examples/s]
Dropping Long Sequences (>4096) (num_proc=24): 66%|███████████▏ | 87000/132318 [00:02<00:00, 47883.31 examples/s]
Dropping Long Sequences (>4096) (num_proc=24): 71%|████████████ | 94000/132318 [00:02<00:00, 48432.39 examples/s]
Dropping Long Sequences (>4096) (num_proc=24): 76%|████████████ | 100000/132318 [00:03<00:00, 38109.09 examples/s]
Dropping Long Sequences (>4096) (num_proc=24): 82%|█████████████▏ | 109000/132318 [00:03<00:00, 47152.44 examples/s]
Dropping Long Sequences (>4096) (num_proc=24): 88%|██████████████ | 116514/132318 [00:03<00:00, 52490.56 examples/s]
Dropping Long Sequences (>4096) (num_proc=24): 94%|███████████████ | 124105/132318 [00:03<00:00, 57668.18 examples/s]
Dropping Long Sequences (>4096) (num_proc=24): 99%|███████████████▉| 131292/132318 [00:03<00:00, 52604.65 examples/s]
Dropping Long Sequences (>4096) (num_proc=24): 100%|████████████████| 132318/132318 [00:03<00:00, 33596.79 examples/s]
Drop Samples with Zero Trainable Tokens (num_proc=24): 0%| | 0/123829 [00:00