Goldkoron committed on
Commit 9a75169 · verified · 1 Parent(s): 4ee4c2b

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +32 -3
README.md CHANGED
@@ -1,4 +1,33 @@
  ---
- base_model:
- - MiniMaxAI/MiniMax-M2.7
- ---
+ license: other
+ license_name: modified-mit
+ license_link: https://huggingface.co/MiniMaxAI/MiniMax-M2.7/blob/main/LICENSE
+ base_model: MiniMaxAI/MiniMax-M2.7
+ tags:
+ - gguf
+ - moe
+ - quantized
+ - minimax
+ ---
+
+ # MiniMax-M2.7 — Gutenberg Quants
+
+ Quantizations of [MiniMax-M2.7](https://huggingface.co/MiniMaxAI/MiniMax-M2.7) using the Gutenberg (K_G) quantization strategy.
+
+ ## Available Quants
+
+ | Quant | Size | BPW | Mean KLD | Same Top P |
+ |-------|------|-----|----------|------------|
+ | *Results pending* | | | | |
+
+ Mean KLD and Same Top P are measured against Q6_K expert reference logits (8192-token context, 10 chunks).
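
The two metrics in the table header can be sketched as follows. This is a minimal illustration, not the evaluation script used for these quants. It assumes that Mean KLD is the KL divergence from the reference distribution to the quantized model's distribution, averaged over token positions, and that Same Top P is the fraction of positions where both models rank the same token first. The function names and the toy logits are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one row of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def kl_divergence(ref_logits, test_logits):
    """KL(ref || test) in nats for a single token position."""
    ref_lp = log_softmax(ref_logits)
    test_lp = log_softmax(test_logits)
    return sum(math.exp(r) * (r - t) for r, t in zip(ref_lp, test_lp))


def argmax(row):
    """Index of the largest value in a row."""
    return max(range(len(row)), key=row.__getitem__)


def compare_logits(ref_rows, test_rows):
    """Return (mean KLD, fraction of positions with the same top token)."""
    klds = [kl_divergence(r, t) for r, t in zip(ref_rows, test_rows)]
    matches = sum(argmax(r) == argmax(t) for r, t in zip(ref_rows, test_rows))
    n = len(klds)
    return sum(klds) / n, matches / n


# Toy data (made up): two token positions, vocabulary of three tokens.
ref = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.0]]
quant = [[1.9, 1.1, 0.1], [2.0, 1.0, 0.0]]

mean_kld, same_top = compare_logits(ref, quant)
print(f"mean KLD = {mean_kld:.4f} nats, same top = {same_top:.0%}")
```

A lower mean KLD and a higher same-top fraction both indicate that the quantized model's output distribution stays closer to the reference.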
+
+ ## Why Gutenberg?
+
+ Standard quantization applies uniform rules to all tensors. Gutenberg uses KLD sensitivity data to allocate precision where it matters most: it upgrades the tensors with the highest measured impact on output quality and keeps less important tensors at the base level. Critical tensors (the top 6 by sensitivity) are locked to Q6_K across all quant levels.
+
+ The result is significantly better quality than standard quants at the same model size.
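
The allocation idea described in this section can be sketched as a simple ranking. This is a hedged illustration under stated assumptions, not the actual Gutenberg implementation: the README does not say how many tensors are upgraded or to which type, so `n_upgraded`, `upgrade_type`, the tensor names, and the sensitivity scores below are all hypothetical. Only the rule that the top 6 tensors are locked to Q6_K comes from the text.

```python
def assign_quant_types(sensitivity, base_type, upgrade_type,
                       n_upgraded=0, n_locked=6, locked_type="Q6_K"):
    """Map each tensor name to a quant type based on its sensitivity rank.

    sensitivity: dict of tensor name -> measured KLD impact (higher = more sensitive).
    The n_locked most sensitive tensors get locked_type, the next n_upgraded
    get upgrade_type, and everything else stays at base_type.
    """
    ranked = sorted(sensitivity, key=sensitivity.get, reverse=True)
    plan = {}
    for rank, name in enumerate(ranked):
        if rank < n_locked:
            plan[name] = locked_type
        elif rank < n_locked + n_upgraded:
            plan[name] = upgrade_type
        else:
            plan[name] = base_type
    return plan


# Hypothetical tensors and scores, for illustration only.
scores = [0.9, 0.1, 0.8, 0.7, 0.05, 0.6, 0.5, 0.4, 0.3, 0.2]
sensitivity = {f"tensor_{i}": s for i, s in enumerate(scores)}

plan = assign_quant_types(sensitivity, base_type="Q4_K",
                          upgrade_type="Q5_K", n_upgraded=2)
for name in sorted(plan):
    print(name, plan[name])
```

Because the locked set does not depend on `base_type`, the same six tensors stay at Q6_K whichever base level is chosen, which matches the "across all quant levels" wording above.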
+
+ ## Compatibility
+
+ Fully compatible with stock llama.cpp, llama-server, LM Studio, and any GGUF-compatible runtime. No custom builds required.