GGUF quants of moonshotai/Kimi-K2-Instruct

Quantization was performed without an imatrix, for the purposes of comparison and experimentation. Perplexity may be worse than expected due to this naive approach.
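As a rough illustration of the naive path described above, the sketch below quantizes a full-precision GGUF with llama-quantize and simply omits the `--imatrix` flag. The file names and the Q4_K_M quant type are placeholders, not the exact invocations used for this repo.

```sh
# Quantize a full-precision GGUF without an importance matrix (the naive
# approach described above). Paths and the Q4_K_M type are illustrative only.
./llama-quantize Kimi-K2-Instruct-F16.gguf Kimi-K2-Instruct-Q4_K_M.gguf Q4_K_M

# An imatrix-guided run would instead generate the matrix first and pass it in:
#   ./llama-imatrix -m Kimi-K2-Instruct-F16.gguf -f calibration.txt -o imatrix.dat
#   ./llama-quantize --imatrix imatrix.dat Kimi-K2-Instruct-F16.gguf \
#       Kimi-K2-Instruct-Q4_K_M.gguf Q4_K_M
```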

Read sample outputs of the quants and my report on them.

| Name | Version |
| --- | --- |
| moonshotai/Kimi-K2-Instruct | c2fee60e5323 |
| convert_hf_to_gguf.py | full-b7588 |
| llama-quantize and llama-gguf-split | c8a37980419e |
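For context, here is a minimal sketch of how the tools listed above are typically chained together; the checkpoint directory, output names, precision, and split size are assumptions, not a record of the exact commands behind these files.

```sh
# 1. Convert the Hugging Face checkpoint to a full-precision GGUF
#    (directory and output names are placeholders).
python convert_hf_to_gguf.py ./Kimi-K2-Instruct \
    --outtype bf16 --outfile Kimi-K2-Instruct-BF16.gguf

# 2. Quantize it (see the no-imatrix example above).

# 3. Split the quantized file into shards for upload, e.g. ~48 GB each;
#    llama-gguf-split appends -00001-of-000NN.gguf to the output prefix.
./llama-gguf-split --split --split-max-size 48G \
    Kimi-K2-Instruct-Q4_K_M.gguf Kimi-K2-Instruct-Q4_K_M
```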

See the original model card here.

Model size: 1T params
Architecture: deepseek2

Available quantizations: 1-bit, 2-bit, 3-bit, 4-bit, 6-bit, 8-bit, 16-bit
