Commit 7b18195 (verified) by spicyneuron · 1 parent: ffb3bf1

Update README.md

Files changed (1): README.md +2 -2
@@ -14,8 +14,8 @@ tags:
 
 # Methodology
 
-Quantized using a custom script inspired by Unsloth-style mixed-precision GGUFs. MLX quantization options differ
-than llama.cpp, but the principles are the same:
+Quantized using a custom script inspired by Unsloth/AesSedai/ubergarm style mixed-precision GGUFs.
+MLX quantization options differ than llama.cpp, but the principles are the same:
 - Sensitive layers like MoE routing, attention, and output embeddings get higher precision
 - More tolerant layers like MoE experts get lower precision
 
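
The two principles in the Methodology text can be illustrated with a small per-layer precision rule. This is only a sketch, not the author's script: the bit widths, the layer-name patterns, and the `choose_bits` helper are all hypothetical, since the README states neither the actual values nor the naming scheme of the model being quantized.

```python
import re

# Hypothetical bit widths; the README does not state the real ones.
HIGH_BITS = 8     # sensitive layers
LOW_BITS = 4      # tolerant layers
DEFAULT_BITS = 6  # everything else

# Hypothetical name patterns; real layer names depend on the model architecture.
SENSITIVE_PATTERNS = [
    r"\.mlp\.gate$",  # MoE routing (router) layer
    r"self_attn",     # attention projections
    r"lm_head",       # output embeddings
]
TOLERANT_PATTERNS = [
    r"experts",       # MoE expert weights
]


def choose_bits(layer_path: str) -> int:
    """Return the quantization bit width for a layer, given its dotted path."""
    # Sensitive patterns are checked first so they always win.
    if any(re.search(p, layer_path) for p in SENSITIVE_PATTERNS):
        return HIGH_BITS
    if any(re.search(p, layer_path) for p in TOLERANT_PATTERNS):
        return LOW_BITS
    return DEFAULT_BITS


if __name__ == "__main__":
    for path in (
        "model.layers.0.mlp.gate",
        "model.layers.0.self_attn.q_proj",
        "model.layers.0.mlp.experts.3.gate_proj",
        "lm_head",
    ):
        print(f"{path}: {choose_bits(path)} bits")
```

Checking the sensitive patterns first keeps a router layer at high precision even when its name also contains a substring shared with expert weights, such as `gate`.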