Direct link for IQ5_K

#2
opened by mayhem4markets

I may be missing something, but I don't see a direct link to download IQ5_K? When I go to https://huggingface.co/ubergarm/MiniMax-M2.5-GGUF/tree/main/IQ5_K and select Use this model and then scroll down to llama.cpp, it links me to the IQ4_XS version. Is there another way to grab it or does something at HF need to be updated? Thanks for your work!

Ahh, so Hugging Face has some issues with ik_llama.cpp-specific quants, but they work fine. Here is how I suggest downloading them:

```shell
# pip install huggingface_hub
hf download --local-dir ./MiniMax-M2.5-GGUF/ --include="IQ5_K/*.gguf" ubergarm/MiniMax-M2.5-GGUF
```
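Sharded GGUFs follow the `-0000N-of-0000M` split-naming convention, so you can sanity-check that every shard actually landed before starting the server. A minimal sketch (the `verify_splits` helper is hypothetical, not part of any library):

```python
import re
from pathlib import Path

# llama.cpp split files end in "-0000N-of-0000M.gguf"
SPLIT_RE = re.compile(r"-(\d{5})-of-(\d{5})\.gguf$")

def verify_splits(model_dir: str) -> int:
    """Check that every shard of a sharded GGUF is present; return the shard count."""
    matches = [SPLIT_RE.search(p.name) for p in Path(model_dir).glob("*.gguf")]
    if not matches or any(m is None for m in matches):
        raise ValueError("no sharded .gguf files found")
    total = int(matches[0].group(2))               # "of" count from the first filename
    have = {int(m.group(1)) for m in matches}      # shard indices actually on disk
    missing = set(range(1, total + 1)) - have
    if missing:
        raise ValueError(f"missing shards: {sorted(missing)}")
    return total
```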

Then just pass the first model file to the llama-server ... command and it will pick up the remaining splits from there!
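Because the split index comes before `-of-` in the filename, a plain lexicographic sort puts the first shard first. A small sketch of picking it (the `first_shard` helper and filenames are illustrative):

```python
from pathlib import Path

def first_shard(model_dir: str) -> Path:
    """Return the first split of a sharded GGUF (the file to pass to llama-server -m)."""
    shards = sorted(Path(model_dir).glob("*.gguf"))  # lexicographic sort: 00001 < 00002 < ...
    if not shards:
        raise FileNotFoundError(f"no .gguf files under {model_dir}")
    return shards[0]
```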

Let me know if you get stuck or have any issues!

This one is working pretty well so far with opencode, as I'm testing while cooking quants haha

Thank you!

Gave it a go, but having an issue. May just be llama.cpp not supporting this quant? I defer to your wisdom.

Used:

```shell
llama-server -m path_to_first_model_GGUF
```

Error (truncated for brevity):

```
gguf_init_from_file_impl: tensor 'blk.0.ffn_gate_exps.weight' has invalid ggml type 140 (NONE)
gguf_init_from_file_impl: failed to read tensor info
llama_model_load: error loading model: llama_model_loader: failed to load GGUF split from llama.cpp/MiniMax-M2.5-GGUF/IQ5_K/MiniMax-M2.5-IQ5_K-00002-of-00005.gguf
llama_model_load_from_file_impl: failed to load model
```
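The log hints at the real cause: the file is intact, but tensor type id 140 is not in mainline's ggml type enum (the real enum lives in ggml.h; ik_llama.cpp extends it with its own quant types such as IQ5_K). A rough sketch of that validation, where the set of known ids is an assumption for illustration:

```python
# Assumption: mainline ggml recognizes roughly type ids 0..39; the exact set
# is defined by the GGML_TYPE_* enum in ggml.h, not by this sketch.
MAINLINE_GGML_TYPES = set(range(0, 40))

def check_tensor_type(type_id: int) -> None:
    """Mimic the mainline check that produces 'has invalid ggml type N (NONE)'."""
    if type_id not in MAINLINE_GGML_TYPES:
        raise ValueError(f"tensor has invalid ggml type {type_id} (NONE)")
```

An ik_llama.cpp build carries the extended enum, so the same tensor loads without complaint there.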

Despite the error message, I do have all five chunks fully downloaded, and I saw no errors during the downloads. I am interested in the quants you are producing, as they seem to have very low perplexity scores. Any thoughts? Thanks again for your help and work.

@mayhem4markets

So the IQ5_K is for ik_llama.cpp only. The only quant I released that works on mainline llama.cpp is the IQ4_XS.

You can see the quick start here for how to quickly compile ik_llama.cpp. It works very similarly to llama.cpp, given ik worked on mainline years ago and ik_llama.cpp is a fork of it: https://huggingface.co/ubergarm/MiniMax-M2.5-GGUF#quick-start
