Commit dc77505 by spicyneuron (verified) · 1 Parent(s): 6154b8c

Update README.md

  1. README.md +1 -1
README.md CHANGED
@@ -29,7 +29,7 @@ uvx --from mlx-lm mlx_lm.server \
 
 # Methodology
 
-Quantized using a custom script inspired by Unsloth/AesSedai/ubergarm style mixed-precision GGUFs.
+Quantized with a [mlx-lm fork](https://github.com/ml-explore/mlx-lm/pull/922), drawing inspiration from Unsloth/AesSedai/ubergarm style mixed-precision GGUFs.
 MLX quantization options differ than llama.cpp, but the principles are the same:
 
 - Sensitive layers like MoE routing, attention, and output embeddings get higher precision.
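
For readers of this diff, the mixed-precision principle in the last context line can be illustrated with a minimal sketch. This is not the script or fork referenced in the commit: the function name `pick_bits`, the layer-name fragments, and the 8-bit/4-bit widths are all hypothetical, chosen only to show how a per-layer rule of this kind might look.

```python
# Hypothetical sketch: assign a bit width to each layer based on its name.
# The name fragments and bit widths below are assumptions for illustration,
# not values taken from the README or from the mlx-lm fork.
SENSITIVE_PARTS = {"router", "self_attn", "lm_head"}


def pick_bits(layer_path: str, default_bits: int = 4, sensitive_bits: int = 8) -> int:
    """Return the quantization bit width for a dotted layer path."""
    parts = layer_path.split(".")
    # A layer counts as sensitive if any path component matches exactly,
    # so "self_attn.q_proj" matches but an unrelated "mlp.gate_proj" does not.
    if any(part in SENSITIVE_PARTS for part in parts):
        return sensitive_bits
    return default_bits


if __name__ == "__main__":
    for path in (
        "model.layers.0.self_attn.q_proj",
        "model.layers.0.mlp.router",
        "model.layers.0.mlp.experts.3.down_proj",
        "lm_head",
    ):
        print(f"{path}: {pick_bits(path)} bits")
```

Matching on whole path components rather than substrings avoids accidentally promoting layers whose names merely contain a sensitive fragment. Real layer names vary by model architecture, so the set would need adjusting per model.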