tokenizer.json:   0%|          | 0.00/11.4M [00:00<?, ?B/s]
Dropping Long Sequences (>4096) (num_proc=4): 100%|████████████████████████████████████████████████| 25620/25620 [00:00<00:00, 29329.19 examples/s]
Saving the dataset (0/4 shards):   0%|          | 0/25103 [00:00<?, ? examples/s]
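For context, a minimal sketch of the kind of preprocessing step that would produce a log like this, assuming the Hugging Face `datasets` and `transformers` libraries; the checkpoint name, dataset name, and `text` column are placeholders, not taken from the run above:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

MAX_LEN = 4096  # sequences longer than this are dropped

# Hypothetical checkpoint and dataset names, for illustration only.
tokenizer = AutoTokenizer.from_pretrained("your-org/your-model")
dataset = load_dataset("your-org/your-dataset", split="train")

def short_enough(example):
    # Keep only examples that fit within the context window.
    return len(tokenizer(example["text"]).input_ids) <= MAX_LEN

filtered = dataset.filter(
    short_enough,
    num_proc=4,                              # matches (num_proc=4) in the log
    desc="Dropping Long Sequences (>4096)",  # matches the progress-bar label
)

# The logged run went from 25620 to 25103 examples after filtering,
# then wrote the result in 4 shards.
filtered.save_to_disk("data/filtered", num_shards=4)
```

With `num_proc=4`, the filter runs in four worker processes, which is why the progress bar jumps in batches; `num_shards=4` in `save_to_disk` accounts for the `(0/4 shards)` counter in the save step.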