ddh0/Q4_K_X.gguf

license: unknown

Q4_K_X.gguf

"Q4_K_X" is an unofficial llama.cpp quantization scheme. The GGUF models available in this repo are quantized as follows:

| Tensor name | GGML type |
|---|---|
| `token_embd` | `Q4_K` |
| `ffn_gate` | `Q4_K` |
| `ffn_up` | `Q4_K` |
| `ffn_down` | `Q5_K` |
| `attn_k` | `Q8_0` |
| `attn_q` | `Q4_K` |
| `attn_v` | `Q8_0` |
| `attn_output` | `Q5_K` |
| `output` | `Q8_0` |
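
For illustration only, the table above can be written as a small Python lookup. This is a sketch, not the procedure used to produce the files in this repo. The helper `ggml_type_for` is hypothetical, and it assumes full tensor names follow the common GGUF pattern `blk.<n>.<name>.weight` or `<name>.weight`. Tensors not listed in the table return `None`, since this README does not state their types.

```python
import re

# Mapping copied from the table in this README: tensor name -> GGML type.
Q4_K_X = {
    "token_embd": "Q4_K",
    "ffn_gate": "Q4_K",
    "ffn_up": "Q4_K",
    "ffn_down": "Q5_K",
    "attn_k": "Q8_0",
    "attn_q": "Q4_K",
    "attn_v": "Q8_0",
    "attn_output": "Q5_K",
    "output": "Q8_0",
}

# Assumption: full names look like "blk.<n>.<name>.weight" or "<name>.weight".
_TENSOR_NAME = re.compile(r"^(?:blk\.\d+\.)?(?P<name>[a-z_]+)\.weight$")


def ggml_type_for(tensor_name):
    """Return the Q4_K_X type for a full tensor name, or None if unlisted."""
    match = _TENSOR_NAME.match(tensor_name)
    if match is None:
        return None
    # Exact match on the extracted name keeps "output" and "attn_output"
    # distinct, and leaves e.g. "output_norm" unmapped.
    return Q4_K_X.get(match.group("name"))


if __name__ == "__main__":
    for name in ("token_embd.weight", "blk.0.attn_output.weight", "output.weight"):
        print(name, ggml_type_for(name))
```

Matching on the exact extracted name, rather than on a substring, avoids confusing `output` with `attn_output`.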