---
license: other
license_name: modified-mit
license_link: https://huggingface.co/MiniMaxAI/MiniMax-M2.7/blob/main/LICENSE
base_model: MiniMaxAI/MiniMax-M2.7
tags:
  - gguf
  - moe
  - quantized
  - minimax
---

# MiniMax-M2.7 — Gutenberg Quants

Quantizations of [MiniMax-M2.7](https://huggingface.co/MiniMaxAI/MiniMax-M2.7) using the Gutenberg (K_G) quantization strategy.

## Available Quants

| Quant | Size | BPW | Mean KLD | Same Top P |
|-------|------|-----|----------|------------|
| K_G_5.00 | 133.1 GiB | 5.00 | 0.022412 | 92.447% |
| K_G_4.50 | 119.7 GiB | 4.50 | 0.029416 | 91.311% |
| K_G_4.00 | 106.4 GiB | 4.00 | 0.044050 | 89.497% |
| K_G_3.50 | 93.1 GiB | 3.50 | 0.061226 | 87.641% |
| K_G_3.00 | 79.9 GiB | 3.00 | 0.098738 | 84.454% |
| K_G_2.50 | 66.6 GiB | 2.50 | 0.172875 | 80.034% |

Mean KLD and Same Top P were measured against Q6_K expert reference logits (8192-token context, 10 chunks).
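As a rough illustration of what the two columns report, the sketch below computes a mean KL divergence and a top-token agreement rate from two sets of logits. It assumes that "Same Top P" means the share of token positions where the quantized model and the reference pick the same most likely token. The function names and the toy logits are made up for the example, and this is not the tooling used to produce the numbers above.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one row of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def kld_and_same_top(ref_logits, quant_logits):
    """Return (mean KL(ref || quant), fraction of positions with the same argmax).

    Both arguments are lists of per-token logit rows with identical shapes.
    """
    total_kld = 0.0
    same_top = 0
    for ref_row, quant_row in zip(ref_logits, quant_logits):
        ref_lp = log_softmax(ref_row)
        quant_lp = log_softmax(quant_row)
        # KL(P || Q) = sum_i P_i * (log P_i - log Q_i), with P the reference.
        total_kld += sum(math.exp(p) * (p - q) for p, q in zip(ref_lp, quant_lp))
        # Count positions where both models rank the same token first.
        if ref_row.index(max(ref_row)) == quant_row.index(max(quant_row)):
            same_top += 1
    n = len(ref_logits)
    return total_kld / n, same_top / n

# Toy data (hypothetical): 3 token positions over a 4-token vocabulary.
ref = [[2.0, 1.0, 0.1, -1.0], [0.5, 2.5, 0.0, 0.3], [1.0, 1.2, 3.0, 0.0]]
quant = [[1.9, 1.1, 0.0, -0.8], [2.6, 2.4, 0.1, 0.2], [0.9, 1.3, 2.8, 0.1]]
mean_kld, same_top_rate = kld_and_same_top(ref, quant)
print(f"mean KLD = {mean_kld:.6f}, same top = {same_top_rate:.3%}")
```

Lower mean KLD and a higher agreement rate both indicate that the quantized model's output distribution stays closer to the reference.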

## vs Standard Quants (unsloth)

| Gutenberg | Gutenberg BPW | Gutenberg KLD | Standard (unsloth) | Standard BPW | Standard KLD |
|-----------|---------------|---------------|--------------------|--------------|--------------|
| K_G_2.50 | 2.50 | **0.172875** | UD-IQ2_M | 2.45 | 0.191059 |
| K_G_3.00 | 3.00 | **0.098738** | UD-IQ3_XXS | 2.80 | 0.119762 |
| K_G_3.50 | 3.50 | **0.061226** | UD-Q3_K_M | 3.54 | 0.063647 |
| K_G_4.00 | 4.00 | **0.044050** | UD-IQ4_XS | 3.79 | 0.051081 |
| K_G_5.00 | 5.00 | **0.022412** | UD-Q4_K_M | 4.90 | 0.024529 |

## Why Gutenberg?

Standard quantization applies uniform rules to all tensors. Gutenberg uses KLD sensitivity data to allocate precision where it matters most, upgrading the tensors that have the highest measured impact on output quality while keeping less important tensors at the base level.
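The general idea of sensitivity-guided allocation can be sketched as a greedy budget problem. This README does not describe the actual Gutenberg procedure, so everything below is an assumption made for illustration: the `Tensor` fields, the two-level base/upgrade scheme, the tensor names, and the sensitivity values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Tensor:
    name: str           # hypothetical tensor name
    n_params: int       # number of weights in the tensor
    sensitivity: float  # hypothetical KLD reduction gained by upgrading this tensor

def allocate(tensors, base_bpw, upgrade_bpw, target_bpw):
    """Greedily upgrade the most sensitive tensors while average BPW stays within target."""
    total_params = sum(t.n_params for t in tensors)
    budget_bits = target_bpw * total_params
    used_bits = base_bpw * total_params  # every tensor starts at the base level
    plan = {t.name: base_bpw for t in tensors}
    extra = upgrade_bpw - base_bpw
    # Rank by KLD benefit per extra bit spent, highest first.
    ranked = sorted(tensors, key=lambda t: t.sensitivity / (extra * t.n_params), reverse=True)
    for t in ranked:
        cost = extra * t.n_params
        if used_bits + cost <= budget_bits:
            plan[t.name] = upgrade_bpw
            used_bits += cost
    return plan, used_bits / total_params

# Toy model (hypothetical sizes and sensitivities).
tensors = [
    Tensor("attn", 100, 0.05),
    Tensor("ffn_a", 400, 0.04),
    Tensor("ffn_b", 400, 0.01),
    Tensor("embed", 100, 0.02),
]
plan, avg_bpw = allocate(tensors, base_bpw=3.0, upgrade_bpw=5.0, target_bpw=4.5)
print(plan, avg_bpw)
```

In this toy run the small, high-impact tensors are upgraded first, and the least sensitive large tensor stays at the base level because upgrading it would exceed the bit budget.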

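In the comparison above, every Gutenberg quant has a lower mean KLD than the unsloth quant it is paired with. The pairs are close in size rather than identical: they differ by up to 0.21 BPW, and K_G_3.50 beats UD-Q3_K_M while being slightly smaller.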

## Compatibility

Fully compatible with stock llama.cpp, llama-server, LM Studio, and any GGUF-compatible runtime. No custom builds required.