Create GGUFs if possible?

#3
by InfernalDread - opened

Hello,

Thank you guys for the work that you do! I was wondering if it would be possible to release GGUF quants in various sizes for people to run under llama.cpp, as it would be a great way to test these models?

Thank you.

I tried using a quant I made myself for llama.cpp, but I received this error at the start of model loading. Would you guys know of a solution?

llama_model_load: error loading model: missing tensor 'blk.60.attn_norm.weight'

While converting, I noticed that the layers went from 0 to 59, but llama.cpp is oddly expecting an extra layer 60.

I had the same issue with nex-agi/Nex-N2-mini and vibecoded a fix. I can't tell you exactly what my agent did, but you can try the same approach to get a working GGUF.


Did you have to regenerate the GGUF, or just patch the llama.cpp project to get your current GGUF to work? If it's the patch, do you think you could upload your version of llama.cpp to GitHub for me to try as well?

Thank you!

config.json advertises mtp_num_hidden_layers: 1, but the uploaded model does not ship the corresponding MTP (Multi-Token Prediction) tensors. Try calling convert_hf_to_gguf.py with the --no-mtp parameter.
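
For anyone who wants to confirm this mismatch on their own download before converting, here is a rough Python sketch. It rests on assumptions not stated in this thread: that MTP layers, when present, are stored right after the main layers (so at index num_hidden_layers and up), that checkpoint tensor names start with `model.layers.<N>.`, and that num_hidden_layers is 60 (inferred from the 0-59 range reported above). The function name `find_missing_mtp_layers` is made up for illustration.

```python
import re

# Assumed naming scheme for per-layer tensors in the HF checkpoint.
LAYER_RE = re.compile(r"^model\.layers\.(\d+)\.")

def find_missing_mtp_layers(config, tensor_names):
    """Return MTP layer indices that config advertises but that have no tensors.

    Assumes MTP layers are numbered directly after the main layers.
    """
    n_main = config["num_hidden_layers"]
    n_mtp = config.get("mtp_num_hidden_layers", 0)

    # Collect every layer index that actually appears in the tensor names.
    present = set()
    for name in tensor_names:
        match = LAYER_RE.match(name)
        if match:
            present.add(int(match.group(1)))

    expected_mtp = range(n_main, n_main + n_mtp)
    return [i for i in expected_mtp if i not in present]

# Toy data mirroring the situation in this thread: layers 0-59 exist,
# but config claims one extra MTP layer.
config = {"num_hidden_layers": 60, "mtp_num_hidden_layers": 1}
tensor_names = [f"model.layers.{i}.input_layernorm.weight" for i in range(60)]

print(find_missing_mtp_layers(config, tensor_names))  # prints [60]
```

On a real sharded checkpoint, the tensor names could probably be taken from the keys of `weight_map` in `model.safetensors.index.json`. A non-empty result would be consistent with the missing `blk.60` tensor reported above.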

Guys, we need IQ3_XXS and IQ4_XS <3
