license: unknown

Q4_K_X.gguf

"Q4_K_X" is an unofficial llama.cpp quantization scheme. The GGUF models available in this repo are quantized as follows:

| Tensor name | GGML type |
|---|---|
| `token_embd` | Q4_K |
| `ffn_gate` | Q4_K |
| `ffn_up` | Q4_K |
| `ffn_down` | Q5_K |
| `attn_k` | Q8_0 |
| `attn_q` | Q4_K |
| `attn_v` | Q8_0 |
| `attn_output` | Q5_K |
| `output` | Q8_0 |
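
As a rough illustration of how a per-tensor mix like the one in the table above might be requested from llama.cpp, the sketch below builds a `llama-quantize` command line from the mapping. It is not the exact recipe used for the files in this repo. It assumes a llama.cpp build whose `llama-quantize` supports the `--tensor-type`, `--token-embedding-type` and `--output-tensor-type` options; the `Q4_K_M` fallback type, the file names and the helper name `build_quantize_command` are hypothetical. How `--tensor-type` patterns are matched against tensor names depends on the llama.cpp version, so check `llama-quantize --help` before relying on it.

```python
import shlex

# Tensor name -> GGML type, copied from the table above.
Q4_K_X = {
    "token_embd": "Q4_K",
    "ffn_gate": "Q4_K",
    "ffn_up": "Q4_K",
    "ffn_down": "Q5_K",
    "attn_k": "Q8_0",
    "attn_q": "Q4_K",
    "attn_v": "Q8_0",
    "attn_output": "Q5_K",
    "output": "Q8_0",
}


def build_quantize_command(src, dst, base_type="Q4_K_M", binary="llama-quantize"):
    """Return an argv list for llama-quantize (hypothetical helper).

    base_type is the fallback for tensors not listed in Q4_K_X; "Q4_K_M" is
    an assumption, not something stated in this README.
    """
    cmd = [binary]
    for name, ggml_type in Q4_K_X.items():
        if name == "token_embd":
            # Assumed flag: dedicated option for the token embedding tensor.
            cmd += ["--token-embedding-type", ggml_type]
        elif name == "output":
            # Assumed flag: dedicated option for the output tensor.
            cmd += ["--output-tensor-type", ggml_type]
        else:
            # Assumed flag: per-tensor override in NAME=TYPE form.
            cmd += ["--tensor-type", f"{name}={ggml_type}"]
    # Positional arguments: input file, output file, fallback type.
    cmd += [src, dst, base_type]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; print the command rather than running it.
    print(shlex.join(build_quantize_command("model-f16.gguf", "model-Q4_K_X.gguf")))
```

Printing the command instead of executing it keeps the sketch safe to run without llama.cpp installed; pass the returned list to `subprocess.run` once the flags have been checked against your build.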