| nohup: ignoring input |
| WARNING:__main__: |
| ***************************************** |
| Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. |
| ***************************************** |
| Running on 8 GPU(s) |
| Using attn_implementation: target=flash_attention_2, draft=sdpa |
| Loading target model: /workspace/models/Qwen3-8B |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| `torch_dtype` is deprecated! Use `dtype` instead! |
|
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]`torch_dtype` is deprecated! Use `dtype` instead! |
|
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]`torch_dtype` is deprecated! Use `dtype` instead! |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| `torch_dtype` is deprecated! Use `dtype` instead! |
|
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 20%|ββ | 1/5 [00:02<00:11, 2.79s/it]
Loading checkpoint shards: 20%|ββ | 1/5 [00:02<00:11, 2.90s/it]
Loading checkpoint shards: 20%|ββ | 1/5 [00:02<00:11, 2.85s/it]
Loading checkpoint shards: 20%|ββ | 1/5 [00:02<00:11, 2.93s/it]
Loading checkpoint shards: 20%|ββ | 1/5 [00:02<00:11, 2.80s/it]
Loading checkpoint shards: 20%|ββ | 1/5 [00:02<00:11, 2.92s/it]
Loading checkpoint shards: 20%|ββ | 1/5 [00:02<00:11, 2.92s/it]
Loading checkpoint shards: 20%|ββ | 1/5 [00:02<00:11, 2.91s/it]
Loading checkpoint shards: 40%|ββββ | 2/5 [00:05<00:08, 2.85s/it]
Loading checkpoint shards: 40%|ββββ | 2/5 [00:05<00:08, 2.90s/it]
Loading checkpoint shards: 40%|ββββ | 2/5 [00:05<00:08, 2.86s/it]
Loading checkpoint shards: 40%|ββββ | 2/5 [00:05<00:08, 2.90s/it]
Loading checkpoint shards: 40%|ββββ | 2/5 [00:05<00:08, 2.85s/it]
Loading checkpoint shards: 40%|ββββ | 2/5 [00:06<00:09, 3.04s/it]
Loading checkpoint shards: 40%|ββββ | 2/5 [00:06<00:09, 3.04s/it]
Loading checkpoint shards: 40%|ββββ | 2/5 [00:06<00:09, 3.04s/it]
Loading checkpoint shards: 60%|ββββββ | 3/5 [00:08<00:05, 2.79s/it]
Loading checkpoint shards: 60%|ββββββ | 3/5 [00:08<00:05, 2.77s/it]
Loading checkpoint shards: 60%|ββββββ | 3/5 [00:08<00:05, 2.80s/it]
Loading checkpoint shards: 60%|ββββββ | 3/5 [00:08<00:05, 2.83s/it]
Loading checkpoint shards: 60%|ββββββ | 3/5 [00:08<00:05, 2.75s/it]
Loading checkpoint shards: 60%|ββββββ | 3/5 [00:08<00:06, 3.00s/it]
Loading checkpoint shards: 60%|ββββββ | 3/5 [00:09<00:06, 3.00s/it]
Loading checkpoint shards: 60%|ββββββ | 3/5 [00:08<00:06, 3.00s/it]
Loading checkpoint shards: 80%|ββββββββ | 4/5 [00:10<00:02, 2.54s/it]
Loading checkpoint shards: 80%|ββββββββ | 4/5 [00:10<00:02, 2.59s/it]
Loading checkpoint shards: 80%|ββββββββ | 4/5 [00:10<00:02, 2.60s/it]
Loading checkpoint shards: 80%|ββββββββ | 4/5 [00:10<00:02, 2.58s/it]
Loading checkpoint shards: 80%|ββββββββ | 4/5 [00:10<00:02, 2.56s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 1.94s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 2.30s/it] |
|
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 1.93s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 1.96s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 2.31s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 2.28s/it] |
|
|
|
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 1.96s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 1.97s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 2.31s/it] |
|
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 2.33s/it] |
| Loading base + LoRA adapter: /workspace/hanrui/syxin/Specforge/outputs/qwen3-8b-sft-32gpu-v3/epoch_3_step_9288 |
|
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 80%|ββββββββ | 4/5 [00:11<00:02, 2.79s/it]
Loading checkpoint shards: 80%|ββββββββ | 4/5 [00:11<00:02, 2.79s/it]
Loading checkpoint shards: 80%|ββββββββ | 4/5 [00:11<00:02, 2.79s/it]
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:12<00:00, 2.12s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:12<00:00, 2.12s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:12<00:00, 2.12s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:12<00:00, 2.48s/it] |
|
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:12<00:00, 2.48s/it] |
|
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:12<00:00, 2.48s/it] |
|
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 20%|ββ | 1/5 [00:02<00:11, 2.87s/it]
Loading checkpoint shards: 20%|ββ | 1/5 [00:02<00:11, 2.90s/it]
Loading checkpoint shards: 20%|ββ | 1/5 [00:02<00:11, 2.90s/it]
Loading checkpoint shards: 20%|ββ | 1/5 [00:02<00:11, 2.92s/it]
Loading checkpoint shards: 20%|ββ | 1/5 [00:02<00:11, 2.92s/it]
Loading checkpoint shards: 20%|ββ | 1/5 [00:03<00:12, 3.06s/it]
Loading checkpoint shards: 20%|ββ | 1/5 [00:03<00:12, 3.06s/it]
Loading checkpoint shards: 20%|ββ | 1/5 [00:03<00:12, 3.06s/it]
Loading checkpoint shards: 40%|ββββ | 2/5 [00:05<00:08, 2.82s/it]
Loading checkpoint shards: 40%|ββββ | 2/5 [00:05<00:08, 2.91s/it]
Loading checkpoint shards: 40%|ββββ | 2/5 [00:05<00:08, 2.91s/it]
Loading checkpoint shards: 40%|ββββ | 2/5 [00:05<00:08, 2.92s/it]
Loading checkpoint shards: 40%|ββββ | 2/5 [00:05<00:08, 2.92s/it]
Loading checkpoint shards: 40%|ββββ | 2/5 [00:05<00:08, 2.95s/it]
Loading checkpoint shards: 40%|ββββ | 2/5 [00:06<00:09, 3.02s/it]
Loading checkpoint shards: 40%|ββββ | 2/5 [00:06<00:09, 3.02s/it]
Loading checkpoint shards: 60%|ββββββ | 3/5 [00:08<00:05, 2.70s/it]
Loading checkpoint shards: 60%|ββββββ | 3/5 [00:08<00:05, 2.84s/it]
Loading checkpoint shards: 60%|ββββββ | 3/5 [00:08<00:05, 2.84s/it]
Loading checkpoint shards: 60%|ββββββ | 3/5 [00:08<00:05, 2.84s/it]
Loading checkpoint shards: 60%|ββββββ | 3/5 [00:08<00:05, 2.84s/it]
Loading checkpoint shards: 60%|ββββββ | 3/5 [00:08<00:05, 2.71s/it]
Loading checkpoint shards: 60%|ββββββ | 3/5 [00:08<00:05, 2.88s/it]
Loading checkpoint shards: 60%|ββββββ | 3/5 [00:08<00:05, 2.88s/it]
Loading checkpoint shards: 80%|ββββββββ | 4/5 [00:10<00:02, 2.47s/it]
Loading checkpoint shards: 80%|ββββββββ | 4/5 [00:10<00:02, 2.64s/it]
Loading checkpoint shards: 80%|ββββββββ | 4/5 [00:10<00:02, 2.64s/it]
Loading checkpoint shards: 80%|ββββββββ | 4/5 [00:10<00:02, 2.64s/it]
Loading checkpoint shards: 80%|ββββββββ | 4/5 [00:10<00:02, 2.64s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:10<00:00, 1.81s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:10<00:00, 2.20s/it] |
|
Loading checkpoint shards: 80%|ββββββββ | 4/5 [00:10<00:02, 2.47s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 1.98s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 1.98s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 2.34s/it] |
|
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 2.34s/it] |
|
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 1.98s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 1.98s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 2.34s/it] |
|
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 2.34s/it] |
| /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/peft/peft_model.py:598: UserWarning: Found missing adapter keys while loading the checkpoint: ['base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.o_proj.lora_B.default.weight']. |
| warnings.warn(warn_message) |
|
Loading checkpoint shards: 80%|ββββββββ | 4/5 [00:11<00:02, 2.64s/it]
Loading checkpoint shards: 80%|ββββββββ | 4/5 [00:11<00:02, 2.64s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 1.82s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 2.23s/it] |
| Using the latest cached version of the dataset since openai/gsm8k couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Found the latest cached dataset configuration 'main' at /workspace/hanrui/datasets/openai___gsm8k/main/0.0.0/cc7b047b6e5bb11b4f1af84efc572db110a51b3c (last modified on Tue Mar 17 13:17:15 2026). |
| /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/peft/peft_model.py:598: UserWarning: Found missing adapter keys while loading the checkpoint: ['base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.o_proj.lora_B.default.weight']. |
| warnings.warn(warn_message) |
|
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 1.97s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 1.97s/it]
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 2.36s/it] |
|
Loading checkpoint shards: 100%|ββββββββββ| 5/5 [00:11<00:00, 2.36s/it] |
| /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/peft/peft_model.py:598: UserWarning: Found missing adapter keys while loading the checkpoint: ['base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.o_proj.lora_B.default.weight']. |
| warnings.warn(warn_message) |
| /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/peft/peft_model.py:598: UserWarning: Found missing adapter keys while loading the checkpoint: ['base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.o_proj.lora_B.default.weight']. |
| warnings.warn(warn_message) |
| /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/peft/peft_model.py:598: UserWarning: Found missing adapter keys while loading the checkpoint: ['base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.o_proj.lora_B.default.weight']. |
| warnings.warn(warn_message) |
| /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/peft/peft_model.py:598: UserWarning: Found missing adapter keys while loading the checkpoint: ['base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.o_proj.lora_B.default.weight']. |
| warnings.warn(warn_message) |
| |
| ============================================================ |
| Benchmark: gsm8k (8 GPUs) |
| ============================================================ |
| Using the latest cached version of the dataset since openai/gsm8k couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Found the latest cached dataset configuration 'main' at /workspace/hanrui/datasets/openai___gsm8k/main/0.0.0/cc7b047b6e5bb11b4f1af84efc572db110a51b3c (last modified on Tue Mar 17 13:17:15 2026). |
| Total 128 samples, distributed across 8 GPUs |
|
[GPU0] gsm8k: 0%| | 0/16 [00:00<?, ?sample/s]Using the latest cached version of the dataset since openai/gsm8k couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Found the latest cached dataset configuration 'main' at /workspace/hanrui/datasets/openai___gsm8k/main/0.0.0/cc7b047b6e5bb11b4f1af84efc572db110a51b3c (last modified on Tue Mar 17 13:17:15 2026). |
| Using the latest cached version of the dataset since openai/gsm8k couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Found the latest cached dataset configuration 'main' at /workspace/hanrui/datasets/openai___gsm8k/main/0.0.0/cc7b047b6e5bb11b4f1af84efc572db110a51b3c (last modified on Tue Mar 17 13:17:15 2026). |
| Using the latest cached version of the dataset since openai/gsm8k couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Found the latest cached dataset configuration 'main' at /workspace/hanrui/datasets/openai___gsm8k/main/0.0.0/cc7b047b6e5bb11b4f1af84efc572db110a51b3c (last modified on Tue Mar 17 13:17:15 2026). |
| Using the latest cached version of the dataset since openai/gsm8k couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Found the latest cached dataset configuration 'main' at /workspace/hanrui/datasets/openai___gsm8k/main/0.0.0/cc7b047b6e5bb11b4f1af84efc572db110a51b3c (last modified on Tue Mar 17 13:17:15 2026). |
| /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/peft/peft_model.py:598: UserWarning: Found missing adapter keys while loading the checkpoint: ['base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.o_proj.lora_B.default.weight']. |
| warnings.warn(warn_message) |
| /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/peft/peft_model.py:598: UserWarning: Found missing adapter keys while loading the checkpoint: ['base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.1.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.1.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.2.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.2.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.3.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.3.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.4.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.4.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.5.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.5.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.6.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.6.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.7.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.7.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.8.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.8.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.9.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.9.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.10.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.10.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.11.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.11.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.12.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.12.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.13.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.13.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.14.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.14.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.15.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.15.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.16.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.16.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.17.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.17.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.18.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.18.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.19.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.19.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.20.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.20.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.21.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.21.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.22.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.22.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.23.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.23.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.24.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.24.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.25.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.25.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.26.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.26.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.27.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.27.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.28.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.28.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.29.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.29.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.30.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.30.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.31.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.31.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.32.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.32.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.33.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.33.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.34.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.34.self_attn.o_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.k_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.k_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.35.self_attn.o_proj.lora_A.default.weight', 'base_model.model.model.layers.35.self_attn.o_proj.lora_B.default.weight']. |
| warnings.warn(warn_message) |
| Using the latest cached version of the dataset since openai/gsm8k couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Found the latest cached dataset configuration 'main' at /workspace/hanrui/datasets/openai___gsm8k/main/0.0.0/cc7b047b6e5bb11b4f1af84efc572db110a51b3c (last modified on Tue Mar 17 13:17:15 2026). |
| Using the latest cached version of the dataset since openai/gsm8k couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Found the latest cached dataset configuration 'main' at /workspace/hanrui/datasets/openai___gsm8k/main/0.0.0/cc7b047b6e5bb11b4f1af84efc572db110a51b3c (last modified on Tue Mar 17 13:17:15 2026). |
|
[GPU0] gsm8k: 0%| | 0/16 [00:14<?, ?sample/s, accept_len=1.98]
[GPU0] gsm8k: 6%|β | 1/16 [00:14<03:30, 14.04s/sample, accept_len=1.98]
[GPU0] gsm8k: 6%|β | 1/16 [00:34<03:30, 14.04s/sample, accept_len=2.01]
[GPU0] gsm8k: 12%|ββ | 2/16 [00:34<04:13, 18.10s/sample, accept_len=2.01]
[GPU0] gsm8k: 12%|ββ | 2/16 [00:50<04:13, 18.10s/sample, accept_len=2.02]
[GPU0] gsm8k: 19%|ββ | 3/16 [00:50<03:41, 17.01s/sample, accept_len=2.02]
[GPU0] gsm8k: 19%|ββ | 3/16 [01:16<03:41, 17.01s/sample, accept_len=2.01]
[GPU0] gsm8k: 25%|βββ | 4/16 [01:16<04:05, 20.48s/sample, accept_len=2.01]
[GPU0] gsm8k: 25%|βββ | 4/16 [01:32<04:05, 20.48s/sample, accept_len=2.02]
[GPU0] gsm8k: 31%|ββββ | 5/16 [01:32<03:29, 19.04s/sample, accept_len=2.02]
[GPU0] gsm8k: 31%|ββββ | 5/16 [01:42<03:29, 19.04s/sample, accept_len=2.03]
[GPU0] gsm8k: 38%|ββββ | 6/16 [01:42<02:38, 15.83s/sample, accept_len=2.03]
[GPU0] gsm8k: 38%|ββββ | 6/16 [02:02<02:38, 15.83s/sample, accept_len=2.04]
[GPU0] gsm8k: 44%|βββββ | 7/16 [02:02<02:34, 17.19s/sample, accept_len=2.04]
[GPU0] gsm8k: 44%|βββββ | 7/16 [02:17<02:34, 17.19s/sample, accept_len=2.07]
[GPU0] gsm8k: 50%|βββββ | 8/16 [02:17<02:13, 16.63s/sample, accept_len=2.07]
[GPU0] gsm8k: 50%|βββββ | 8/16 [02:36<02:13, 16.63s/sample, accept_len=2.08]
[GPU0] gsm8k: 56%|ββββββ | 9/16 [02:36<02:00, 17.28s/sample, accept_len=2.08]
[GPU0] gsm8k: 56%|ββββββ | 9/16 [02:51<02:00, 17.28s/sample, accept_len=2.08]
[GPU0] gsm8k: 62%|βββββββ | 10/16 [02:51<01:39, 16.54s/sample, accept_len=2.08]
[GPU0] gsm8k: 62%|βββββββ | 10/16 [03:05<01:39, 16.54s/sample, accept_len=2.07]
[GPU0] gsm8k: 69%|βββββββ | 11/16 [03:05<01:18, 15.75s/sample, accept_len=2.07]
[GPU0] gsm8k: 69%|βββββββ | 11/16 [03:27<01:18, 15.75s/sample, accept_len=2.07]
[GPU0] gsm8k: 75%|ββββββββ | 12/16 [03:27<01:10, 17.64s/sample, accept_len=2.07]
[GPU0] gsm8k: 75%|ββββββββ | 12/16 [03:41<01:10, 17.64s/sample, accept_len=2.05]
[GPU0] gsm8k: 81%|βββββββββ | 13/16 [03:41<00:49, 16.44s/sample, accept_len=2.05]
[GPU0] gsm8k: 81%|βββββββββ | 13/16 [03:55<00:49, 16.44s/sample, accept_len=2.05]
[GPU0] gsm8k: 88%|βββββββββ | 14/16 [03:55<00:31, 15.90s/sample, accept_len=2.05]
[GPU0] gsm8k: 88%|βββββββββ | 14/16 [04:04<00:31, 15.90s/sample, accept_len=2.04]
[GPU0] gsm8k: 94%|ββββββββββ| 15/16 [04:04<00:13, 13.74s/sample, accept_len=2.04]
[GPU0] gsm8k: 94%|ββββββββββ| 15/16 [04:26<00:13, 13.74s/sample, accept_len=2.05]
[GPU0] gsm8k: 100%|ββββββββββ| 16/16 [04:26<00:00, 16.31s/sample, accept_len=2.05]
[GPU0] gsm8k: 100%|ββββββββββ| 16/16 [04:26<00:00, 16.68s/sample, accept_len=2.05] |
| Using the latest cached version of the dataset since openai/openai_humaneval couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Using the latest cached version of the dataset since openai/openai_humaneval couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Using the latest cached version of the dataset since openai/openai_humaneval couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Using the latest cached version of the dataset since openai/openai_humaneval couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Using the latest cached version of the dataset since openai/openai_humaneval couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Found the latest cached dataset configuration 'openai_humaneval' at /workspace/hanrui/datasets/openai___openai_humaneval/openai_humaneval/0.0.0/7dce6050a7d6d172f3cc5c32aa97f52fa1a2e544 (last modified on Tue Mar 17 17:01:06 2026). |
| Found the latest cached dataset configuration 'openai_humaneval' at /workspace/hanrui/datasets/openai___openai_humaneval/openai_humaneval/0.0.0/7dce6050a7d6d172f3cc5c32aa97f52fa1a2e544 (last modified on Tue Mar 17 17:01:06 2026). |
| Found the latest cached dataset configuration 'openai_humaneval' at /workspace/hanrui/datasets/openai___openai_humaneval/openai_humaneval/0.0.0/7dce6050a7d6d172f3cc5c32aa97f52fa1a2e544 (last modified on Tue Mar 17 17:01:06 2026). |
| Found the latest cached dataset configuration 'openai_humaneval' at /workspace/hanrui/datasets/openai___openai_humaneval/openai_humaneval/0.0.0/7dce6050a7d6d172f3cc5c32aa97f52fa1a2e544 (last modified on Tue Mar 17 17:01:06 2026). |
| Using the latest cached version of the dataset since openai/openai_humaneval couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Using the latest cached version of the dataset since openai/openai_humaneval couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Found the latest cached dataset configuration 'openai_humaneval' at /workspace/hanrui/datasets/openai___openai_humaneval/openai_humaneval/0.0.0/7dce6050a7d6d172f3cc5c32aa97f52fa1a2e544 (last modified on Tue Mar 17 17:01:06 2026). |
| Found the latest cached dataset configuration 'openai_humaneval' at /workspace/hanrui/datasets/openai___openai_humaneval/openai_humaneval/0.0.0/7dce6050a7d6d172f3cc5c32aa97f52fa1a2e544 (last modified on Tue Mar 17 17:01:06 2026). |
| /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/storage.py:414: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature. |
| return torch.load(io.BytesIO(b)) |
| Found the latest cached dataset configuration 'openai_humaneval' at /workspace/hanrui/datasets/openai___openai_humaneval/openai_humaneval/0.0.0/7dce6050a7d6d172f3cc5c32aa97f52fa1a2e544 (last modified on Tue Mar 17 17:01:06 2026). |
|
|
| gsm8k Results: |
| Decoding speedup: 1.04x |
| Average Acceptance length: 2.04 |
| Acceptance length histogram: ['0.0%', '7.2%', '82.7%', '9.8%', '0.3%', '0.0%', '0.0%', '0.0%', '0.0%', '0.0%', '0.0%', '0.0%', '0.0%', '0.0%', '0.0%', '0.0%', '0.0%'] |
| Num responses: 128 |
|
|
| ============================================================ |
| Benchmark: humaneval (8 GPUs) |
| ============================================================ |
| Using the latest cached version of the dataset since openai/openai_humaneval couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Found the latest cached dataset configuration 'openai_humaneval' at /workspace/hanrui/datasets/openai___openai_humaneval/openai_humaneval/0.0.0/7dce6050a7d6d172f3cc5c32aa97f52fa1a2e544 (last modified on Tue Mar 17 17:01:06 2026). |
| Total 164 samples, distributed across 8 GPUs |
|
[GPU0] humaneval: 0%| | 0/21 [00:00<?, ?sample/s]
[GPU0] humaneval: 0%| | 0/21 [00:33<?, ?sample/s, accept_len=2.01]
[GPU0] humaneval: 5%|β | 1/21 [00:33<11:02, 33.12s/sample, accept_len=2.01]
[GPU0] humaneval: 5%|β | 1/21 [00:50<11:02, 33.12s/sample, accept_len=1.98]
[GPU0] humaneval: 10%|β | 2/21 [00:50<07:33, 23.87s/sample, accept_len=1.98]
[GPU0] humaneval: 10%|β | 2/21 [01:06<07:33, 23.87s/sample, accept_len=1.99]
[GPU0] humaneval: 14%|ββ | 3/21 [01:06<06:03, 20.22s/sample, accept_len=1.99]
[GPU0] humaneval: 14%|ββ | 3/21 [01:30<06:03, 20.22s/sample, accept_len=1.98]
[GPU0] humaneval: 19%|ββ | 4/21 [01:30<06:07, 21.63s/sample, accept_len=1.98]
[GPU0] humaneval: 19%|ββ | 4/21 [02:26<06:07, 21.63s/sample, accept_len=1.98]
[GPU0] humaneval: 24%|βββ | 5/21 [02:26<09:04, 34.01s/sample, accept_len=1.98]
[GPU0] humaneval: 24%|βββ | 5/21 [02:59<09:04, 34.01s/sample, accept_len=1.98]
[GPU0] humaneval: 29%|βββ | 6/21 [02:59<08:24, 33.64s/sample, accept_len=1.98]
[GPU0] humaneval: 29%|βββ | 6/21 [03:14<08:24, 33.64s/sample, accept_len=1.99]
[GPU0] humaneval: 33%|ββββ | 7/21 [03:14<06:29, 27.79s/sample, accept_len=1.99]
[GPU0] humaneval: 33%|ββββ | 7/21 [03:34<06:29, 27.79s/sample, accept_len=1.97]
[GPU0] humaneval: 38%|ββββ | 8/21 [03:34<05:28, 25.23s/sample, accept_len=1.97]
[GPU0] humaneval: 38%|ββββ | 8/21 [04:15<05:28, 25.23s/sample, accept_len=1.99]
[GPU0] humaneval: 43%|βββββ | 9/21 [04:15<06:03, 30.26s/sample, accept_len=1.99]
[GPU0] humaneval: 43%|βββββ | 9/21 [04:46<06:03, 30.26s/sample, accept_len=2.00]
[GPU0] humaneval: 48%|βββββ | 10/21 [04:46<05:34, 30.45s/sample, accept_len=2.00]
[GPU0] humaneval: 48%|βββββ | 10/21 [05:10<05:34, 30.45s/sample, accept_len=1.99]
[GPU0] humaneval: 52%|ββββββ | 11/21 [05:10<04:44, 28.49s/sample, accept_len=1.99]
[GPU0] humaneval: 52%|ββββββ | 11/21 [05:46<04:44, 28.49s/sample, accept_len=2.00]
[GPU0] humaneval: 57%|ββββββ | 12/21 [05:46<04:34, 30.53s/sample, accept_len=2.00]
[GPU0] humaneval: 57%|ββββββ | 12/21 [06:17<04:34, 30.53s/sample, accept_len=2.02]
[GPU0] humaneval: 62%|βββββββ | 13/21 [06:17<04:07, 30.94s/sample, accept_len=2.02]
[GPU0] humaneval: 62%|βββββββ | 13/21 [06:45<04:07, 30.94s/sample, accept_len=2.00]
[GPU0] humaneval: 67%|βββββββ | 14/21 [06:45<03:30, 30.01s/sample, accept_len=2.00]
[GPU0] humaneval: 67%|βββββββ | 14/21 [07:08<03:30, 30.01s/sample, accept_len=1.99]
[GPU0] humaneval: 71%|ββββββββ | 15/21 [07:08<02:46, 27.82s/sample, accept_len=1.99]
[GPU0] humaneval: 71%|ββββββββ | 15/21 [07:39<02:46, 27.82s/sample, accept_len=2.00]
[GPU0] humaneval: 76%|ββββββββ | 16/21 [07:39<02:23, 28.65s/sample, accept_len=2.00]
[GPU0] humaneval: 76%|ββββββββ | 16/21 [08:13<02:23, 28.65s/sample, accept_len=1.98]
[GPU0] humaneval: 81%|ββββββββ | 17/21 [08:13<02:01, 30.47s/sample, accept_len=1.98]
[GPU0] humaneval: 81%|ββββββββ | 17/21 [08:44<02:01, 30.47s/sample, accept_len=1.98]
[GPU0] humaneval: 86%|βββββββββ | 18/21 [08:44<01:32, 30.67s/sample, accept_len=1.98]
[GPU0] humaneval: 86%|βββββββββ | 18/21 [09:10<01:32, 30.67s/sample, accept_len=1.98]
[GPU0] humaneval: 90%|βββββββββ | 19/21 [09:10<00:58, 29.28s/sample, accept_len=1.98]
[GPU0] humaneval: 90%|βββββββββ | 19/21 [09:33<00:58, 29.28s/sample, accept_len=1.99]
[GPU0] humaneval: 95%|ββββββββββ| 20/21 [09:33<00:27, 27.30s/sample, accept_len=1.99]
[GPU0] humaneval: 95%|ββββββββββ| 20/21 [10:10<00:27, 27.30s/sample, accept_len=1.97]
[GPU0] humaneval: 100%|ββββββββββ| 21/21 [10:10<00:00, 30.06s/sample, accept_len=1.97]
[GPU0] humaneval: 100%|ββββββββββ| 21/21 [10:10<00:00, 29.05s/sample, accept_len=1.97] |
| Using the latest cached version of the dataset since HuggingFaceH4/mt_bench_prompts couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Using the latest cached version of the dataset since HuggingFaceH4/mt_bench_prompts couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Using the latest cached version of the dataset since HuggingFaceH4/mt_bench_prompts couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Using the latest cached version of the dataset since HuggingFaceH4/mt_bench_prompts couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Found the latest cached dataset configuration 'default' at /workspace/hanrui/datasets/HuggingFaceH4___mt_bench_prompts/default/0.0.0/e3a795c5e9a82ee40611c416b8a7786c73198991 (last modified on Tue Mar 17 13:17:15 2026). |
| Using the latest cached version of the dataset since HuggingFaceH4/mt_bench_prompts couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Using the latest cached version of the dataset since HuggingFaceH4/mt_bench_prompts couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Found the latest cached dataset configuration 'default' at /workspace/hanrui/datasets/HuggingFaceH4___mt_bench_prompts/default/0.0.0/e3a795c5e9a82ee40611c416b8a7786c73198991 (last modified on Tue Mar 17 13:17:15 2026). |
| Found the latest cached dataset configuration 'default' at /workspace/hanrui/datasets/HuggingFaceH4___mt_bench_prompts/default/0.0.0/e3a795c5e9a82ee40611c416b8a7786c73198991 (last modified on Tue Mar 17 13:17:15 2026). |
| Found the latest cached dataset configuration 'default' at /workspace/hanrui/datasets/HuggingFaceH4___mt_bench_prompts/default/0.0.0/e3a795c5e9a82ee40611c416b8a7786c73198991 (last modified on Tue Mar 17 13:17:15 2026). |
| Found the latest cached dataset configuration 'default' at /workspace/hanrui/datasets/HuggingFaceH4___mt_bench_prompts/default/0.0.0/e3a795c5e9a82ee40611c416b8a7786c73198991 (last modified on Tue Mar 17 13:17:15 2026). |
| Found the latest cached dataset configuration 'default' at /workspace/hanrui/datasets/HuggingFaceH4___mt_bench_prompts/default/0.0.0/e3a795c5e9a82ee40611c416b8a7786c73198991 (last modified on Tue Mar 17 13:17:15 2026). |
| Using the latest cached version of the dataset since HuggingFaceH4/mt_bench_prompts couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Found the latest cached dataset configuration 'default' at /workspace/hanrui/datasets/HuggingFaceH4___mt_bench_prompts/default/0.0.0/e3a795c5e9a82ee40611c416b8a7786c73198991 (last modified on Tue Mar 17 13:17:15 2026). |
|
|
| humaneval Results: |
| Decoding speedup: 0.98x |
| Average Acceptance length: 2.00 |
| Acceptance length histogram: ['0.0%', '9.9%', '80.7%', '9.2%', '0.2%', '0.0%', '0.0%', '0.0%', '0.0%', '0.0%', '0.0%', '0.0%', '0.0%', '0.0%', '0.0%', '0.0%', '0.0%'] |
| Num responses: 164 |
|
|
| ============================================================ |
| Benchmark: mt-bench (8 GPUs) |
| ============================================================ |
| Using the latest cached version of the dataset since HuggingFaceH4/mt_bench_prompts couldn't be found on the Hugging Face Hub (offline mode is enabled). |
| Found the latest cached dataset configuration 'default' at /workspace/hanrui/datasets/HuggingFaceH4___mt_bench_prompts/default/0.0.0/e3a795c5e9a82ee40611c416b8a7786c73198991 (last modified on Tue Mar 17 13:17:15 2026). |
| Total 80 samples, distributed across 8 GPUs |
|
[GPU0] mt-bench: 0%| | 0/10 [00:00<?, ?sample/s]
[GPU0] mt-bench: 0%| | 0/10 [03:54<?, ?sample/s, accept_len=1.94]
[GPU0] mt-bench: 10%|β | 1/10 [03:54<35:13, 234.83s/sample, accept_len=1.94]
[GPU0] mt-bench: 10%|β | 1/10 [04:13<35:13, 234.83s/sample, accept_len=1.92]
[GPU0] mt-bench: 20%|ββ | 2/10 [04:13<14:21, 107.71s/sample, accept_len=1.92]
[GPU0] mt-bench: 20%|ββ | 2/10 [06:27<14:21, 107.71s/sample, accept_len=1.93]
[GPU0] mt-bench: 30%|βββ | 3/10 [06:27<13:57, 119.67s/sample, accept_len=1.93]
[GPU0] mt-bench: 30%|βββ | 3/10 [10:04<13:57, 119.67s/sample, accept_len=1.94]
[GPU0] mt-bench: 40%|ββββ | 4/10 [10:04<15:49, 158.23s/sample, accept_len=1.94]
[GPU0] mt-bench: 40%|ββββ | 4/10 [10:43<15:49, 158.23s/sample, accept_len=2.02]
[GPU0] mt-bench: 50%|βββββ | 5/10 [10:43<09:35, 115.18s/sample, accept_len=2.02]
[GPU0] mt-bench: 50%|βββββ | 5/10 [12:17<09:35, 115.18s/sample, accept_len=2.03]
[GPU0] mt-bench: 60%|ββββββ | 6/10 [12:17<07:11, 107.94s/sample, accept_len=2.03]
[GPU0] mt-bench: 60%|ββββββ | 6/10 [15:19<07:11, 107.94s/sample, accept_len=2.02]
[GPU0] mt-bench: 70%|βββββββ | 7/10 [15:19<06:36, 132.26s/sample, accept_len=2.02]
[GPU0] mt-bench: 70%|βββββββ | 7/10 [15:24<06:36, 132.26s/sample, accept_len=2.04]
[GPU0] mt-bench: 80%|ββββββββ | 8/10 [15:24<03:03, 91.72s/sample, accept_len=2.04]
[GPU0] mt-bench: 80%|ββββββββ | 8/10 [16:29<03:03, 91.72s/sample, accept_len=2.04]
[GPU0] mt-bench: 90%|βββββββββ | 9/10 [16:29<01:23, 83.16s/sample, accept_len=2.04][rank7]:[E319 15:53:48.132088258 ProcessGroupNCCL.cpp:607] [Rank 7] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=5, OpType=ALLGATHER, NumelIn=1, NumelOut=8, Timeout(ms)=600000) ran for 600074 milliseconds before timing out. |
| [rank7]:[E319 15:53:48.132413394 ProcessGroupNCCL.cpp:1664] [PG 0 (default_pg) Rank 7] Exception (either an error or timeout) detected by watchdog at work: 5, last enqueued NCCL work: 5, last completed NCCL work: 4. |
| [rank7]:[E319 15:53:48.762836691 ProcessGroupNCCL.cpp:1709] [PG 0 (default_pg) Rank 7] Timeout at NCCL work: 5, last enqueued NCCL work: 5, last completed NCCL work: 4. |
| [rank7]:[E319 15:53:48.762873123 ProcessGroupNCCL.cpp:621] [Rank 7] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. |
| [rank7]:[E319 15:53:48.762882303 ProcessGroupNCCL.cpp:627] [Rank 7] To avoid data inconsistency, we are taking the entire process down. |
| [rank7]:[E319 15:53:49.764485702 ProcessGroupNCCL.cpp:1515] [PG 0 (default_pg) Rank 7] Process group watchdog thread terminated with exception: [Rank 7] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=5, OpType=ALLGATHER, NumelIn=1, NumelOut=8, Timeout(ms)=600000) ran for 600074 milliseconds before timing out. |
| Exception raised from checkTimeout at ../torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:609 (most recent call first): |
| frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f004b0cbf86 in /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/lib/libc10.so) |
| frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x1d2 (0x7efffd3738d2 in /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/lib/libtorch_cuda.so) |
| frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7efffd37a313 in /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/lib/libtorch_cuda.so) |
| frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x10c (0x7efffd37c6fc in /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/lib/libtorch_cuda.so) |
| frame #4: <unknown function> + 0xdf0e6 (0x7f004e5e40e6 in /workspace/miniconda3/envs/dflash/bin/../lib/libstdc++.so.6) |
| frame #5: <unknown function> + 0x891f5 (0x7f0050ee61f5 in /usr/lib/x86_64-linux-gnu/libc.so.6) |
| frame #6: <unknown function> + 0x1098dc (0x7f0050f668dc in /usr/lib/x86_64-linux-gnu/libc.so.6) |
| |
| terminate called after throwing an instance of 'c10::DistBackendError' |
| what(): [PG 0 (default_pg) Rank 7] Process group watchdog thread terminated with exception: [Rank 7] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=5, OpType=ALLGATHER, NumelIn=1, NumelOut=8, Timeout(ms)=600000) ran for 600074 milliseconds before timing out. |
| Exception raised from checkTimeout at ../torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:609 (most recent call first): |
| frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f004b0cbf86 in /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/lib/libc10.so) |
| frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x1d2 (0x7efffd3738d2 in /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/lib/libtorch_cuda.so) |
| frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7efffd37a313 in /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/lib/libtorch_cuda.so) |
| frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x10c (0x7efffd37c6fc in /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/lib/libtorch_cuda.so) |
| frame #4: <unknown function> + 0xdf0e6 (0x7f004e5e40e6 in /workspace/miniconda3/envs/dflash/bin/../lib/libstdc++.so.6) |
| frame #5: <unknown function> + 0x891f5 (0x7f0050ee61f5 in /usr/lib/x86_64-linux-gnu/libc.so.6) |
| frame #6: <unknown function> + 0x1098dc (0x7f0050f668dc in /usr/lib/x86_64-linux-gnu/libc.so.6) |
| |
| Exception raised from ncclCommWatchdog at ../torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1521 (most recent call first): |
| frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f004b0cbf86 in /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/lib/libc10.so) |
| frame #1: <unknown function> + 0xe5aa84 (0x7efffd005a84 in /workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/lib/libtorch_cuda.so) |
| frame #2: <unknown function> + 0xdf0e6 (0x7f004e5e40e6 in /workspace/miniconda3/envs/dflash/bin/../lib/libstdc++.so.6) |
| frame #3: <unknown function> + 0x891f5 (0x7f0050ee61f5 in /usr/lib/x86_64-linux-gnu/libc.so.6) |
| frame #4: <unknown function> + 0x1098dc (0x7f0050f668dc in /usr/lib/x86_64-linux-gnu/libc.so.6) |
| |
| W0319 15:53:49.559000 140004854380352 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 4318 closing signal SIGTERM |
| W0319 15:53:49.560000 140004854380352 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 4319 closing signal SIGTERM |
| W0319 15:53:49.560000 140004854380352 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 4320 closing signal SIGTERM |
| W0319 15:53:49.561000 140004854380352 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 4321 closing signal SIGTERM |
| W0319 15:53:49.562000 140004854380352 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 4322 closing signal SIGTERM |
| W0319 15:53:49.562000 140004854380352 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 4323 closing signal SIGTERM |
| W0319 15:53:49.563000 140004854380352 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 4324 closing signal SIGTERM |
| E0319 15:53:52.882000 140004854380352 torch/distributed/elastic/multiprocessing/api.py:833] failed (exitcode: -6) local_rank: 7 (pid: 4325) of binary: /workspace/miniconda3/envs/dflash/bin/python3 |
| Traceback (most recent call last): |
| File "<frozen runpy>", line 198, in _run_module_as_main |
| File "<frozen runpy>", line 88, in _run_code |
| File "/workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/distributed/run.py", line 905, in <module> |
| main() |
| File "/workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 348, in wrapper |
| return f(*args, **kwargs) |
| ^^^^^^^^^^^^^^^^^^ |
| File "/workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/distributed/run.py", line 901, in main |
| run(args) |
| File "/workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/distributed/run.py", line 892, in run |
| elastic_launch( |
| File "/workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 133, in __call__ |
| return launch_agent(self._config, self._entrypoint, list(args)) |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| File "/workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent |
| raise ChildFailedError( |
| torch.distributed.elastic.multiprocessing.errors.ChildFailedError: |
| ============================================================ |
| /workspace/hanrui/syxin_old/eval_dflash_lora_inject.py FAILED |
| ------------------------------------------------------------ |
| Failures: |
| <NO_OTHER_FAILURES> |
| ------------------------------------------------------------ |
| Root Cause (first observed failure): |
| [0]: |
| time : 2026-03-19_15:53:49 |
| host : job-006ce80a7c47-20260302193512-5dcd4c9bbd-lkk5h |
| rank : 7 (local_rank: 7) |
| exitcode : -6 (pid: 4325) |
| error_file: <N/A> |
| traceback : Signal 6 (SIGABRT) received by PID 4325 |
| ============================================================ |
| |