MiniMax-M2.5 GGUF Benchmarks

#6 by danielhanchen

"Lessons: Models aren’t equally robust, even under otherwise very good quantization algorithms.
Just take Q4, it’ll be fine” is a rule of thumb that doesn’t generalize."

Interestingly, the Unsloth quants, especially Q4_K_XL, perform much better than their non-Unsloth counterparts (despite being 8 GB smaller).

Conducted once again by Benjamin Marie: https://x.com/bnjmn_marie/status/2027043753484021810
Guide to run: https://unsloth.ai/docs/models/minimax-m25
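For anyone following the guide, here is a minimal sketch of fetching just the Q4_K_XL shards with `huggingface_hub` and pointing llama.cpp at them. The repo id, quant file pattern, and llama.cpp flags below are assumptions on my part, so check the guide above for the exact values:

```python
# Minimal sketch: download an Unsloth dynamic quant and run it with llama.cpp.
# The repo id and quant pattern are assumptions -- see the Unsloth guide for exact values.
from huggingface_hub import snapshot_download

# Download only the UD-Q4_K_XL shards instead of the whole repository.
local_dir = snapshot_download(
    repo_id="unsloth/MiniMax-M2.5-GGUF",   # assumed repo id
    allow_patterns=["*UD-Q4_K_XL*"],       # assumed quant file pattern
)
print("GGUF shards downloaded to:", local_dir)

# Then point llama.cpp at the first shard, for example:
#   llama-cli -m <local_dir>/<first UD-Q4_K_XL shard>.gguf -ngl 99 --ctx-size 8192
```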



Looks like MiniMax doesn't like being quantized.

I am using unsloth_MiniMax-M2.5-GGUF_UD-IQ2_M and it is pretty good. Do you know how the Ollama versions compare?

Unsloth AI org


Ollama uses standard quantization, like LM Studio. No benchmarks have been conducted for it at the moment.

Thank you very much for these benchmarks. They were surely time-consuming to produce, but they paint a more easily interpreted comparison between the quants. Much appreciated.
