[2026-06-23 12:02:57,317] [DEBUG] [axolotl.utils.config.resolve_dtype:74] [PID:2077] bf16 support detected, enabling for this configuration.
config.json: 0%| | 0.00/661 [00:00
[2026-06-23 12:03:04,497] [DEBUG] [axolotl.loaders.tokenizer.load_tokenizer:312] [PID:2077] BOS: None / None
[2026-06-23 12:03:04,497] [DEBUG] [axolotl.loaders.tokenizer.load_tokenizer:313] [PID:2077] PAD: 151643 / <|endoftext|>
[2026-06-23 12:03:04,497] [DEBUG] [axolotl.loaders.tokenizer.load_tokenizer:314] [PID:2077] UNK: None / None
[2026-06-23 12:03:04,497] [INFO] [axolotl.utils.data.shared.load_preprocessed_dataset:482] [PID:2077] Unable to find prepared dataset in out/prepared_full/629a2e5ec728ba197df2909eccb74717
[2026-06-23 12:03:04,498] [INFO] [axolotl.utils.data.sft._load_raw_datasets:320] [PID:2077] Loading raw datasets...
[2026-06-23 12:03:04,498] [WARNING] [axolotl.utils.data.sft._load_raw_datasets:322] [PID:2077] Processing datasets during training can lead to VRAM instability. Please pre-process your dataset using `axolotl preprocess path/to/config.yml`.
Downloading (incomplete total...): 0.00B [00:00, ?B/s]
Fetching 0 files: 0it [00:00, ?it/s]
Fetching 0 files: 0it [00:00, ?it/s]
Download complete: : 0.00B [00:00, ?B/s]
README.md: 0%| | 0.00/602 [00:00
Dropping Invalid Sequences (2048) (num_proc=12): 100%|██████████| 1800/1800 [00:00<00:00, 5144.28 examples/s]
Saving the dataset (0/7 shards): 0%| | 0/1800 [00:00
[2026-06-23 12:03:52,290] [DEBUG] [axolotl.loaders.tokenizer.load_tokenizer:312] [PID:2077] BOS: None / None
[2026-06-23 12:03:52,290] [DEBUG] [axolotl.loaders.tokenizer.load_tokenizer:313] [PID:2077] PAD: 151643 / <|endoftext|>
[2026-06-23 12:03:52,290] [DEBUG] [axolotl.loaders.tokenizer.load_tokenizer:314] [PID:2077] UNK: None / None
[2026-06-23 12:03:52,290] [DEBUG] [axolotl.train.setup_model_and_tokenizer:81] [PID:2077] Loading model
[2026-06-23 12:03:52,531] [DEBUG] [axolotl.monkeypatch.torchao_optim.patch_torchao_optim_state_8bit:75] [PID:2077] Patched OptimState8bit for torch.compile compatibility
[2026-06-23 12:03:52,532] [DEBUG] [axolotl.monkeypatch.torchao_optim.patch_torchao_optim_state_8bit:122] [PID:2077] Patched OptimState4bit for torch.compile compatibility
[2026-06-23 12:03:52,532] [DEBUG] [axolotl.monkeypatch.torchao_optim.patch_torchao_optim_state_8bit:154] [PID:2077] Patched OptimStateFp8 for torch.compile compatibility
[2026-06-23 12:03:52,540] [DEBUG] [axolotl.monkeypatch.transformers.trainer_loss_calc.patch_evaluation_loop:94] [PID:2077] Patched Trainer.evaluation_loop with nanmean loss calculation
[2026-06-23 12:03:52,541] [DEBUG] [axolotl.monkeypatch.transformers.trainer_loss_calc.patch_maybe_log_save_evaluate:148] [PID:2077] Patched Trainer._maybe_log_save_evaluate with nanmean loss calculation
[2026-06-23 12:03:52,543] [WARNING] [axolotl.loaders.patch_manager._apply_self_attention_lora_patch:662] [PID:2077] Cannot patch self-attention - requires no dropout
model.safetensors.index.json: 0%| | 0.00/35.6k [00:00