Update README.md
README.md
CHANGED
@@ -133,11 +133,12 @@ python3 llama.cpp/convert_hf_to_gguf.py /path/to/model/ --outfile Model-BF16
 ```
 
 **Step B: Generate Importance Matrix (iMatrix)**
-Download a calibration dataset (
+Download a calibration dataset (One is included) and generate the iMatrix:
 ```bash
-
+python3 /generate_imatrix.py /.gguf /calibration_data.txt -o /imatrix.dat --chunks 10 --verbose
 ```
-
+Note: The provided imatrix generator is superior to llama imatrix generator and is meant to be used with HPC quantize.
+
 
 **Step C: Quantize with HPC**
 Execute the re-quantizer with your newly generated BF16 GGUF and iMatrix.
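The Step B command added in this change has empty path prefixes, so it cannot be pasted as-is. The following is a minimal sketch of one way to parameterize it. The positional order and the `-o`, `--chunks 10`, and `--verbose` flags are copied from the added line; the variable names and all default values (`TOOL_DIR`, `Model-BF16.gguf`, and so on) are hypothetical placeholders, not paths taken from the README.

```bash
#!/bin/sh
set -eu

# Hypothetical defaults: none of these paths come from the README, override them as needed.
TOOL_DIR="${TOOL_DIR:-.}"                         # assumed folder holding generate_imatrix.py
MODEL_GGUF="${MODEL_GGUF:-Model-BF16.gguf}"       # assumed name of the BF16 GGUF from Step A
CALIB_TXT="${CALIB_TXT:-calibration_data.txt}"    # calibration text file
IMATRIX_OUT="${IMATRIX_OUT:-imatrix.dat}"         # where the iMatrix is written

# Render the command as one line so it can be reviewed before anything runs.
imatrix_cmd_line() {
  printf '%s\n' "python3 $TOOL_DIR/generate_imatrix.py $MODEL_GGUF $CALIB_TXT -o $IMATRIX_OUT --chunks 10 --verbose"
}

# Run the generator with the same arguments and flags as the README line.
run_imatrix() {
  python3 "$TOOL_DIR/generate_imatrix.py" "$MODEL_GGUF" "$CALIB_TXT" \
    -o "$IMATRIX_OUT" --chunks 10 --verbose
}

imatrix_cmd_line

# Execute only when the script and both inputs exist; otherwise this is a dry run.
if [ -f "$TOOL_DIR/generate_imatrix.py" ] && [ -f "$MODEL_GGUF" ] && [ -f "$CALIB_TXT" ]; then
  run_imatrix
else
  echo "dry run: script or inputs not found, nothing executed" >&2
fi
```

Setting the variables on the command line (for example `MODEL_GGUF=/data/my-model.gguf sh step_b.sh`, where both the path and the script name are again hypothetical) overrides the defaults without editing the script.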