---
license: unknown
---
Q4_K_X.gguf
"Q4_K_X" is an unofficial llama.cpp quantization scheme. The GGUF models available in this repo are quantized as follows:
| Tensor name | GGML type |
|---|---|
| token_embd | Q4_K |
| ffn_gate | Q4_K |
| ffn_up | Q4_K |
| ffn_down | Q5_K |
| attn_k | Q8_0 |
| attn_q | Q4_K |
| attn_v | Q8_0 |
| attn_output | Q5_K |
| output | Q8_0 |
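
The mapping in the table can be written as a small lookup, which may help when checking a file's tensors against the scheme. This is only a sketch: `Q4_K_X_TYPES` and `q4_k_x_type` are hypothetical names, not part of llama.cpp, and the helper assumes the usual GGUF naming pattern (for example `blk.0.attn_k.weight`). Tensors not listed in the table, such as norm weights, return `None` because this card does not say how they are stored.

```python
# Tensor-to-type mapping copied from the table above.
Q4_K_X_TYPES = {
    "token_embd": "Q4_K",
    "ffn_gate": "Q4_K",
    "ffn_up": "Q4_K",
    "ffn_down": "Q5_K",
    "attn_k": "Q8_0",
    "attn_q": "Q4_K",
    "attn_v": "Q8_0",
    "attn_output": "Q5_K",
    "output": "Q8_0",
}


def q4_k_x_type(tensor_name):
    """Return the GGML type listed for a tensor, or None if it is not in the table.

    Assumes names shaped like "blk.<layer>.<name>.weight" or "<name>.weight".
    """
    parts = tensor_name.split(".")
    # Drop a trailing "weight" or "bias" suffix if present.
    if parts and parts[-1] in ("weight", "bias"):
        parts = parts[:-1]
    if not parts:
        return None
    # Match on the last remaining component so "attn_output" and "output"
    # are never confused with each other.
    return Q4_K_X_TYPES.get(parts[-1])
```

For instance, `q4_k_x_type("blk.3.ffn_down.weight")` looks up `ffn_down` and returns its listed type, while `q4_k_x_type("output_norm.weight")` returns `None`.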