Error Dequantizing model.layers.0.mlp.down_proj.qweight

#6
by chenda7 - opened

Hi,
I'm trying to dequantize a model using the AutoAWQ dequantize code, but I get the following error when processing model.layers.0.mlp.down_proj.qweight:

File "/path/to/packing_utils.py", line 100, in dequantize_gemm
    iweight = (iweight - izeros) * scales
RuntimeError: The size of tensor a (18432) must match the size of tensor b (36864) at non-singleton dimension 0

It seems like there's a tensor-dimension mismatch in the computation (iweight - izeros) * scales: along dimension 0, one tensor has exactly twice as many rows (36864) as the other (18432). Could you help me understand why this is happening and how to fix it?
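A 2x row mismatch like this often comes from the group size used to expand the per-group zero points not matching the group size the checkpoint was actually quantized with. Below is a minimal sketch (NumPy in place of PyTorch; all shapes, numbers, and variable names are illustrative, not AutoAWQ's actual code or this model's real dimensions) showing how a wrong assumed group size makes izeros end up with twice the rows of iweight:

```python
import numpy as np

# Hypothetical small dimensions for illustration only.
in_features = 512         # rows of the unpacked int weight
out_features = 64         # columns after unpacking the 4-bit values
true_group_size = 128     # group size the checkpoint was quantized with
assumed_group_size = 256  # group size the loading code wrongly assumes

# Unpacked weight: one row per input feature.
iweight = np.zeros((in_features, out_features), dtype=np.int32)

# Zero points are stored once per quantization group. The checkpoint
# fixes the number of groups; the loader expands them back to per-row
# zeros by repeating each group `group_size` times.
n_groups = in_features // true_group_size            # fixed by checkpoint
izeros_per_group = np.zeros((n_groups, out_features), dtype=np.int32)
izeros = np.repeat(izeros_per_group, assumed_group_size, axis=0)

print(iweight.shape)  # (512, 64)
print(izeros.shape)   # (1024, 64) -- twice the rows, because 256 != 128

try:
    _ = iweight - izeros  # mirrors (iweight - izeros) * scales
except ValueError as e:
    # NumPy's analogue of PyTorch's size-mismatch RuntimeError
    print("broadcast error:", e)
```

With assumed_group_size equal to true_group_size the shapes line up and the subtraction broadcasts cleanly, so checking that the group_size (and bit width) in the quantization config matches the checkpoint is a reasonable first step.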

Thanks!

chenda7 changed discussion status to closed
