cmh committed (verified) · Commit 1357581 · Parent: ba4d074

Update README.md

Files changed (1): README.md (+3 -1)
README.md CHANGED
@@ -12,7 +12,8 @@ pipeline_tag: text-generation
 
 - [Output and embed tensors quantized to q8_0, all other tensors quantized for q4_k.](https://huggingface.co/RobertSinclair)
 - [Output and embed tensors quantized to bf16, all other tensors quantized for q5_k, q6_k, q8_0 and q8_0 --pure.](https://huggingface.co/RobertSinclair)
-
+- IMatrix q5_k, q6_k
+- BF16
 ```
 python convert_hf_to_gguf.py --outtype bf16 phi-4 --outfile phi-4.bf16.gguf
 
@@ -32,6 +33,7 @@ llama-quantize --allow-requantize --pure phi-4.bf16.gguf phi-4.bf16.q8_p.gguf q8
 | [phi-4.bf16.q6.im](https://huggingface.co/cmh/test/blob/main/phi-4.bf16.q6.im.gguf) | 6.00 bits per weight | 13.2 GB | **15.5 GB** |
 | [phi-4.bf16.q8](https://huggingface.co/cmh/test/blob/main/phi-4.bf16.q8.gguf) | 8.00 bits per weight | 16.5 GB | **18.5 GB** |
 | [phi-4.bf16.q8_p](https://huggingface.co/cmh/test/blob/main/phi-4.bf16.q8_p.gguf) | 8.00 bits per weight | 15.6 GB | **18.6 GB** |
+| [phi-4.bf16](https://huggingface.co/cmh/test/blob/main/phi-4.bf16.gguf) | 16.00 bits per weight | 29.3 GB | |
 
 
 <sub>*approximate value at 16k context, FP16 cache.</sub>
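
The per-tensor splits named in the first two bullets correspond to llama-quantize's tensor-type overrides. A minimal sketch of the first bullet's recipe, assuming llama.cpp's `--output-tensor-type` and `--token-embedding-type` flags; the output filename is illustrative:

```
# Hold the output and token-embedding tensors at q8_0 while quantizing
# all remaining tensors to q4_k (output filename is illustrative).
llama-quantize --output-tensor-type q8_0 --token-embedding-type q8_0 \
  phi-4.bf16.gguf phi-4.bf16.q4.gguf q4_k
```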
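
The imatrix q5_k/q6_k variants would be built by first computing an importance matrix and then passing it to llama-quantize. A sketch assuming a representative calibration file named calibration.txt (hypothetical name):

```
# Compute an importance matrix from a calibration corpus
# (calibration.txt is a placeholder; any representative text works).
llama-imatrix -m phi-4.bf16.gguf -f calibration.txt -o phi-4.imatrix

# Quantize with the importance matrix guiding precision allocation
# (q6_k here, matching phi-4.bf16.q6.im in the table).
llama-quantize --imatrix phi-4.imatrix phi-4.bf16.gguf phi-4.bf16.q6.im.gguf q6_k
```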
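
On the footnote's 16k-context figures: the gap between the file size and the bolded value is roughly the FP16 KV cache, i.e. 2 (K and V) × n_layers × n_kv_heads × head_dim × 2 bytes × n_ctx. Assuming phi-4's published config (40 layers, 10 KV heads, head dim 128; treat these values as assumptions), that gives 2 × 40 × 10 × 128 × 2 × 16384 ≈ 3.4 GB, the right order of magnitude for the 2–3 GB gaps in the table.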