cmh committed · Commit a517f8e · verified · 1 parent: a198b4a

Update README.md

Files changed (1): README.md (+7 −7)
README.md CHANGED

@@ -15,12 +15,12 @@ pipeline_tag: text-generation
 
 | Filename | Quant type | File Size | ~Vram* |
 | -------- | ---------- | --------- | -------- |
-| [phi-4_3bpw](https://huggingface.co/cmh/phi-4_exl3/tree/3bpw) | 3.00 bits per weight | tbd GB | **tbd GB** |
-| [phi-4_4bpw](https://huggingface.co/cmh/phi-4_exl3/tree/4bpw) | 4.00 bits per weight | tbd GB | **tbd GB** |
-| [phi-4_5bpw](https://huggingface.co/cmh/phi-4_exl3/tree/5bpw) | 5.00 bits per weight | tbd GB | **tbd GB** |
-| [phi-4_6bpw](https://huggingface.co/cmh/phi-4_exl3/tree/6bpw) | 6.00 bits per weight | tbd GB | **tbd GB** |
-| [phi-4_7bpw](https://huggingface.co/cmh/phi-4_exl3/tree/7bpw) | 7.00 bits per weight | tbd GB | **tbd GB** |
-| [phi-4_8bpw](https://huggingface.co/cmh/phi-4_exl3/tree/8bpw) | 8.00 bits per weight | tbd GB | **tbd GB** |
+| [phi-4_3bpw](https://huggingface.co/cmh/phi-4_exl3/tree/3bpw) | 3.00 bits per weight | 6.08 GB | **7.8 GB** |
+| [phi-4_4bpw](https://huggingface.co/cmh/phi-4_exl3/tree/4bpw) | 4.00 bits per weight | 7.67 GB | **9.4 GB** |
+| [phi-4_5bpw](https://huggingface.co/cmh/phi-4_exl3/tree/5bpw) | 5.00 bits per weight | 9.25 GB | **tbd GB** |
+| [phi-4_6bpw](https://huggingface.co/cmh/phi-4_exl3/tree/6bpw) | 6.00 bits per weight | 10.8 GB | **tbd GB** |
+| [phi-4_7bpw](https://huggingface.co/cmh/phi-4_exl3/tree/7bpw) | 7.00 bits per weight | 12.4 GB | **tbd GB** |
+| [phi-4_8bpw](https://huggingface.co/cmh/phi-4_exl3/tree/8bpw) | 8.00 bits per weight | 14.0 GB | **tbd GB** |
 
 <sub>*approximate value at 16k context, FP16 cache.<sup>

@@ -55,4 +55,4 @@ How should I explain the Internet?<|im_end|>
 
 ### With exllamav3's chat.py:
 
-    python examples\chat.py -m models\phi-4_exl3\4bpw -mode phi
+    python examples\chat.py -m models\phi-4_exl3\4bpw -mode raw
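As a rough sanity check on the file sizes filled in above, the size of a quantized model is approximately parameter count × bits per weight ÷ 8. The sketch below assumes phi-4's roughly 14.7B parameters; it is a lower-bound estimate only, since real EXL3 files also carry tensors stored at higher precision.

```python
# Lower-bound size estimate for a quantized model file:
#   size_bytes ≈ n_params * bits_per_weight / 8
# N_PARAMS is an assumption for this sketch (phi-4 is ~14.7B parameters);
# actual EXL3 files run somewhat larger due to higher-precision tensors.

N_PARAMS = 14.7e9

def approx_size_gb(bits_per_weight: float, n_params: float = N_PARAMS) -> float:
    """Approximate quantized file size in GB (decimal gigabytes)."""
    return n_params * bits_per_weight / 8 / 1e9

for bpw in (3, 4, 5, 6, 7, 8):
    print(f"{bpw} bpw ≈ {approx_size_gb(bpw):.2f} GB")
```

For 4 bpw this gives about 7.35 GB against the listed 7.67 GB, so the table values are plausible once the extra higher-precision tensors are accounted for.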