GLM 4.6 imatrix
So far I have done plenty of PPL/KLD/custom tests with GLM 4.5 using different imatrix files, and the mradermacher one has the best overall performance. It would be great if you guys could make one for GLM 4.6 as well. Thank you 🙂
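For context, PPL/KLD comparisons like these are typically run with llama.cpp's llama-perplexity tool. A rough sketch of that workflow follows; the file names are placeholders and exact flags may vary by llama.cpp version:

```shell
# Save reference logits from the full-precision model to a base file
# (file names here are illustrative, not the ones used in the tests above)
./llama-perplexity -m GLM-4.5-F16.gguf -f eval.txt \
    --kl-divergence-base glm45-base.kld

# Score a quantized model against those saved logits:
# reports perplexity plus KL-divergence statistics
./llama-perplexity -m GLM-4.5-Q4_K_M.gguf -f eval.txt \
    --kl-divergence-base glm45-base.kld --kl-divergence
```

The same quantized model can then be re-scored with each candidate imatrix build to compare them on identical eval text.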
Thanks a lot for reminding me. We'll gladly create quants of GLM 4.6. What an impressive model. It's the perfect size to run with 256 GiB of RAM, like I have on my old PC. I completely forgot about GLM 4.6: it was blocked because it required a llama.cpp update, and I then forgot to queue it once mradermacher updated llama.cpp.
So far I have done plenty of PPL/KLD/custom tests with GLM 4.5 using different imatrix files, and the mradermacher one has the best overall performance.
Thanks a lot for the comparison. That is awesome to know. So far we only knew that we have a better imatrix dataset than bartowski because ours is twice as large: the first half is his imatrix dataset, and the second half is high-quality proprietary data filling the gaps in bartowski's dataset, like story fragments and other secret ingredients not even I know. I never really compared us with the other quanters, but I'm glad mradermacher's perfectionism in curating the imatrix dataset made our quants come out on top. It would be cool if you could share your results, as I'm really interested.
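As a rough sketch of how such a combined calibration set feeds into llama.cpp's imatrix tool (assuming the standard llama-imatrix workflow; file names are illustrative, not the actual private dataset):

```shell
# Concatenate the two calibration text sets into one
# (both file names are placeholders)
cat bartowski_calibration.txt proprietary_additions.txt > combined_calibration.txt

# llama-imatrix runs the model over the calibration text and writes the
# importance matrix that llama-quantize later uses to weight quantization error
./llama-imatrix -m GLM-4.6-F16.gguf -f combined_calibration.txt -o imatrix.dat
```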
It's queued! :D
I realized that I actually queued it back when llama.cpp was updated, but missed that I had to mv mtp.safetensors model-mtp.safetensors, which is why it failed, and I had no time to look into it since. I have now redownloaded it and manually converted it into GGUF, and static quants are currently being computed.
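For anyone hitting the same failure, the fix amounts to renaming the multi-token-prediction tensor file before conversion. A sketch assuming llama.cpp's standard convert_hf_to_gguf.py script (paths and output names are illustrative):

```shell
# Inside the downloaded HF checkpoint directory, the separate MTP shard
# must follow the model-* naming scheme the converter expects
cd GLM-4.6
[ -f mtp.safetensors ] && mv mtp.safetensors model-mtp.safetensors
cd ..

# Standard llama.cpp HF-to-GGUF conversion (flags may vary by version)
python llama.cpp/convert_hf_to_gguf.py GLM-4.6 \
    --outfile GLM-4.6-F16.gguf --outtype f16
```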
You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#GLM-4.6-GGUF for quants to appear.
Excellent, thank you so much :)
So far we only knew that we have a better imatrix dataset than bartowski due to using a twice as large dataset with the first half being his imatrix dataset and the second half being high quality proprietary data filling the gaps of bartowski 's imatrix dataset like story fragments and other secret ingredients not even I know.
Interesting, so it is like a bartowski++ version haha
I shared the findings in https://www.reddit.com/r/LocalLLaMA/comments/1mwevt4/comment/na058yk/ (see the new follow-up comment at the bottom)
My eval for deterministic code refactoring turns out to be quite well suited for comparing quantization results 🙂
Oohhh it is up. Thanks guys ❤️