is Imatrix better than the regular Quants?
i'm curious about it, I see two GGUF models one had the "i" for the Imatrix, but confused on which I should use.
Weighted/imatrix quants offer higher quality than static quants at the same model size and resource usage. If unsure, always use weighted/imatrix quants. I recommend you consult the quality column on our download page, linked in all our model cards. You can even select different metrics like KL divergence, perplexity, same-token probability, and eval results to check which quant best fits your needs.
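To make the metrics concrete: KL divergence measures how far the quantized model's next-token probability distribution drifts from the original model's, with 0 meaning identical. A minimal sketch (the token distributions below are made-up illustrative numbers, not real model outputs):

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) in nats between two next-token probability distributions.
    Lower means the quantized model stays closer to the original."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token distributions over a tiny 4-token vocabulary:
original = [0.70, 0.20, 0.08, 0.02]   # full-precision model
quantized = [0.65, 0.23, 0.09, 0.03]  # quantized model

# A small positive value: the quant tracks the original closely.
divergence = kl_divergence(original, quantized)
```

Same-token probability is even simpler: the fraction of positions where both models pick the same top token.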
Thank you so much for the info.
Also see these benchmarks https://huggingface.co/mradermacher/BabyHercules-4x150M-GGUF/discussions/2#674a7958ce9bc37b8e33cf55
I was also curious about this, and I think imatrix might not be that good... when doing quantization you need a dataset; you use this dataset to "guide" the model and tell it what to give more attention to when making quants, which will "preserve" that specific data or writing style during compression.
So let's say a model is made specifically for roleplaying and story writing. If the person doing the quantization only has a dataset from Wikipedia, news, or technical books, for example, the model won't perform as expected compared to classic static quantization.
Unless the person has a dataset made entirely from story books, the model's behavior will be robotic and more logical rather than creative; it will lose emotional tone and lack good dialogue flow, which is expected from a roleplay/story-writing model.
This is interesting,
Is this how it works?
@Noire1 No, not at all. The imatrix dataset is only used to measure which weights in the model are important and so should be quantized at higher precision. The imatrix dataset does not in any way change the knowledge, behavior, or writing style of the model. It does the exact opposite: it tries to find a way to quantize the model to any desired size while keeping it as close to the original as possible. As you can see in the KL divergence, same-token probability, and top-token probability measurements, weighted/imatrix quants are far closer to the original unquantized model than static quants. This also applies to use cases and even languages not present in the imatrix dataset. Even training an importance matrix with random tokens will result in weighted/imatrix quants superior to same-sized static quants (someone even wrote a paper about it).
Regarding the question of what data should be included in an imatrix dataset, please read the discussion I had about this exact topic last week: https://huggingface.co/mradermacher/model_requests/discussions/1470