| [2026-02-01 21:46:32,649] [INFO] [axolotl.utils.data.sft._load_raw_datasets:320] [PID:13534] Loading raw datasets... | |
| [2026-02-01 21:46:32,995] [INFO] [axolotl.utils.data.wrappers.get_dataset_wrapper:87] [PID:13534] Loading dataset: mmlu.jsonl with base_type: chat_template and prompt_style: None | |
| Dropping Long Sequences (>4096) (num_proc=48): 100%|██████████| 390/390 [00:01<00:00, 301.67 examples/s] | |
| Saving the dataset (1/1 shards): 100%|██████████| 390/390 [00:00<00:00, 11764.30 examples/s] | |
| generation_config.json: 100%|██████████| 242/242 [00:00<00:00, 709kB/s] | |