MiniMax-M2.7-APEX-I-Compact.gguf completely broken and outputting gibberish

#3
by EclipseMist - opened


This quant is broken. I don't have this issue with the unsloth UD IQ4_NL quant. I am running it in llama.cpp with the Vulkan backend, so it's not the CUDA issue.

I have the same problem running MiniMax-M2.7-APEX-I-Mini.gguf.

This might be an imatrix-specific problem. I tried the Compact variant and it seemed to work, but I-Compact did not.

That could be the case. Unfortunately for me, the I-Mini is the only one small enough to fit on my machine (Evo-X2 with 96 GB of RAM). Speed-wise I'm getting really good performance (tg128 on llama-bench gives 39 tokens/s). I also tried the I-Mini of MiniMax M2.5, and that one works just fine.

I have tried llama.cpp compiled with both Vulkan and ROCm. Both produce gibberish for M2.7 and appear to work fine for M2.5 (I-Mini).

I will keep digging and let you guys know if I find something.
