How do I use this?
I'm really confused; I don't see anyone else running into problems, but I've run out of avenues to explore.
I'm trying to run the default .json you've included. I've also tried the one embedded in the provided image, and I get the same error either way.
CLIPLoader
Error(s) in loading state_dict for Llama2:
size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([151936, 1024]) from checkpoint, the shape in current model is torch.Size([128256, 4096]).
size mismatch for model.layers.0.self_attn.q_proj.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.0.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.0.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.0.self_attn.o_proj.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.0.mlp.gate_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
etc...
I've tried updating ComfyUI (0.80.0), running a fresh venv, and checking for node updates; nothing helps.
Sorry if the answer is really simple. I'm literally just importing your .json and hitting run, with no luck.
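For context, the errors above are PyTorch's standard strict `load_state_dict` failure: the tensor shapes saved in the checkpoint file don't match the model the loader built. Here that suggests the wrong text-encoder file is selected in CLIPLoader: the checkpoint's 151936-token embedding is characteristic of Qwen-family encoders, while the expected 128256 × 4096 shapes match a Llama-3-8B-style encoder (this is an inference from the shapes, not something the log states directly). A minimal sketch of the same failure class, using hypothetical stand-in layers whose sizes mirror `q_proj` from the log:

```python
import torch.nn as nn

# Stand-ins for the mismatched encoders (not actual CLIPLoader code):
# the checkpoint on disk was saved from a smaller architecture than
# the model the loader constructs, so strict loading must fail.
saved = nn.Linear(1024, 2048, bias=False)     # checkpoint-side weight: [2048, 1024]
expected = nn.Linear(4096, 4096, bias=False)  # loader-side weight: [4096, 4096]

try:
    # strict=True (the default) raises on any shape mismatch
    expected.load_state_dict(saved.state_dict())
except RuntimeError as err:
    print("size mismatch" in str(err))  # True
```

The fix is to point the node at the matching text-encoder checkpoint, not to retry loading; no setting makes mismatched shapes load.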
There is an official template included in ComfyUI.
Be sure to update!
Same problem here; the CLIP fails to load. Any suggestions?
CLIPLoader
Error(s) in loading state_dict for Llama2:
size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([151936, 1024]) from checkpoint, the shape in current model is torch.Size([128256, 4096]).
size mismatch for model.layers.0.self_attn.q_proj.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.0.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.0.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.0.self_attn.o_proj.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.0.mlp.gate_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.0.mlp.up_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.0.mlp.down_proj.weight: copying a param with shape torch.Size([1024, 3072]) from checkpoint, the shape in current model is torch.Size([4096, 14336]).
size mismatch for model.layers.0.input_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.0.post_attention_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
(... the same size mismatches repeat for model.layers.1 through model.layers.19 ...)
size mismatch for model.layers.19.input_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.19.post_attention_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.20.self_attn.q_proj.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.20.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.20.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.20.self_attn.o_proj.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.20.mlp.gate_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.20.mlp.up_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.20.mlp.down_proj.weight: copying a param with shape torch.Size([1024, 3072]) from checkpoint, the shape in current model is torch.Size([4096, 14336]).
size mismatch for model.layers.20.input_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.20.post_attention_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.21.self_attn.q_proj.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.21.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.21.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.21.self_attn.o_proj.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.21.mlp.gate_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.21.mlp.up_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.21.mlp.down_proj.weight: copying a param with shape torch.Size([1024, 3072]) from checkpoint, the shape in current model is torch.Size([4096, 14336]).
size mismatch for model.layers.21.input_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.21.post_attention_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.22.self_attn.q_proj.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.22.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.22.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.22.self_attn.o_proj.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.22.mlp.gate_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.22.mlp.up_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.22.mlp.down_proj.weight: copying a param with shape torch.Size([1024, 3072]) from checkpoint, the shape in current model is torch.Size([4096, 14336]).
size mismatch for model.layers.22.input_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.22.post_attention_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.23.self_attn.q_proj.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.23.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.23.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.23.self_attn.o_proj.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.23.mlp.gate_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.23.mlp.up_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.23.mlp.down_proj.weight: copying a param with shape torch.Size([1024, 3072]) from checkpoint, the shape in current model is torch.Size([4096, 14336]).
size mismatch for model.layers.23.input_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.23.post_attention_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.24.self_attn.q_proj.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.24.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.24.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.24.self_attn.o_proj.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.24.mlp.gate_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.24.mlp.up_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.24.mlp.down_proj.weight: copying a param with shape torch.Size([1024, 3072]) from checkpoint, the shape in current model is torch.Size([4096, 14336]).
size mismatch for model.layers.24.input_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.24.post_attention_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.25.self_attn.q_proj.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.25.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.25.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.25.self_attn.o_proj.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.25.mlp.gate_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.25.mlp.up_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.25.mlp.down_proj.weight: copying a param with shape torch.Size([1024, 3072]) from checkpoint, the shape in current model is torch.Size([4096, 14336]).
size mismatch for model.layers.25.input_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.25.post_attention_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.26.self_attn.q_proj.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.26.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.26.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.26.self_attn.o_proj.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.26.mlp.gate_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.26.mlp.up_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.26.mlp.down_proj.weight: copying a param with shape torch.Size([1024, 3072]) from checkpoint, the shape in current model is torch.Size([4096, 14336]).
size mismatch for model.layers.26.input_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.26.post_attention_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.27.self_attn.q_proj.weight: copying a param with shape torch.Size([2048, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.27.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.27.self_attn.v_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.27.self_attn.o_proj.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.27.mlp.gate_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.27.mlp.up_proj.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([14336, 4096]).
size mismatch for model.layers.27.mlp.down_proj.weight: copying a param with shape torch.Size([1024, 3072]) from checkpoint, the shape in current model is torch.Size([4096, 14336]).
size mismatch for model.layers.27.input_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.27.post_attention_layernorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([4096]).
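For anyone else debugging this: the left-hand shapes in the error come from the checkpoint file CLIPLoader was given, and the right-hand shapes are what the expected Llama-class text encoder needs (128256 is the Llama 3 vocabulary size, 4096 its hidden size), so the loader is most likely being pointed at the wrong .safetensors file. A quick way to see what a file actually contains, without loading any weights, is to read the safetensors header, which is an 8-byte little-endian length followed by a JSON index of tensor names and shapes. This is just a diagnostic sketch; `safetensors_shapes` is a hypothetical helper name, and the key `model.embed_tokens.weight` is taken from the error log above.

```python
import json
import struct
import sys


def safetensors_shapes(path):
    """Return {tensor_name: shape} from a .safetensors file by
    parsing only its JSON header (no weights are loaded)."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian uint64 length of the JSON header.
        header_len = struct.unpack("<Q", f.read(8))[0]
        header = json.loads(f.read(header_len))
    # Skip the optional "__metadata__" entry; everything else is a tensor.
    return {k: v["shape"] for k, v in header.items() if k != "__metadata__"}


if __name__ == "__main__" and len(sys.argv) > 1:
    shapes = safetensors_shapes(sys.argv[1])
    print("embed_tokens shape:", shapes.get("model.embed_tokens.weight"))
    # A first dimension of 128256 suggests a Llama-3-class encoder (what the
    # workflow expects); 151936 suggests a Qwen-family checkpoint instead.
```

If the embedding shape printed here doesn't match the "current model" shapes in the error, swap in the text encoder file the workflow's template actually calls for.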