Update README.md
README.md CHANGED
@@ -14,16 +14,25 @@ inference: false
### This repository contains quantized conversions of the current Pygmalion 6B checkpoints.
*For use with frontends that support GGML quantized GPT-J models, such as KoboldCpp and Oobabooga (with the CTransformers loader).*

*Last updated on 2023-09-26.*

**Description:**
- The `pygmalion-6b-main` files are quantized from the main branch of Pygmalion 6B from 2023-01-13, also known as "experiment 2".
- The `pygmalion-6b-dev` files are quantized from the dev branch of Pygmalion 6B from 2023-03-12, also known as "part 4/10 of experiment 7".

**RAM usage:**

Model | Startup RAM usage (KoboldCpp) | Startup RAM usage (Oobabooga)
:--:|:--:|:--:
pygmalion-6b-main.q4_0.bin | 3.7 GiB | 3.7 GiB
pygmalion-6b-main.q4_1.bin | 4.1 GiB | 4.1 GiB
pygmalion-6b-main.q5_0.bin | 4.4 GiB | 4.4 GiB
pygmalion-6b-main.q5_1.bin | 4.8 GiB | 4.8 GiB
pygmalion-6b-main.q8_0.bin | 6.5 GiB | 6.6 GiB

**Notes:**
- ggerganov/ggml [[8ca2c19]](https://github.com/ggerganov/ggml/tree/8ca2c19a3bb8622954d858fbf6383522684eaf34)'s gpt-j conversion script was used for conversion and quantization: the checkpoints were first converted to f16 GGML files and then quantized. A rough sketch of the two steps follows this list.
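The sketch below illustrates that convert-then-quantize flow. It is a hypothetical reconstruction: the script path, intermediate file name, and quantization-type argument are assumptions rather than commands recorded in this repository, so check ggml's `examples/gpt-j` at commit 8ca2c19 for the exact interface.

```python
# Hypothetical sketch of the convert-then-quantize flow described in the note
# above. Paths, file names, and the quantization-type argument are assumptions;
# consult ggml's examples/gpt-j at commit 8ca2c19 for the actual interface.
import subprocess

MODEL_DIR = "pygmalion-6b"                 # local clone of the Hugging Face checkpoint (assumed)
F16_FILE = "ggml-model-f16.bin"            # intermediate f16 GGML file (assumed name)
QUANT_FILE = "pygmalion-6b-main.q5_1.bin"  # one of the quantized files in this repo
QUANT_TYPE = "9"                           # assumed to select q5_1; check the tool's usage output

# Step 1: convert the Hugging Face checkpoint to an f16 GGML file with
# ggml's gpt-j conversion script.
subprocess.run(
    ["python3", "ggml/examples/gpt-j/convert-h5-to-ggml.py", MODEL_DIR],
    check=True,
)

# Step 2: quantize the f16 file with the gpt-j quantize tool built from the
# same ggml tree; the last argument selects the quantization format.
subprocess.run(
    ["./ggml/build/bin/gpt-j-quantize", F16_FILE, QUANT_FILE, QUANT_TYPE],
    check=True,
)
```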
The original model can be found [here](https://huggingface.co/PygmalionAI/pygmalion-6b), and the original model card can be found below.
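For a quick sanity check outside of a full frontend, a quantized file can also be loaded directly with the `ctransformers` Python bindings (the library behind Oobabooga's CTransformers loader). This is a minimal sketch; the file path, prompt, and generation settings are illustrative, not an official snippet from this repository.

```python
# Minimal sketch: load a GGML GPT-J file with the ctransformers Python library.
# The file path, prompt, and generation settings are illustrative only.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "pygmalion-6b-main.q5_1.bin",  # path to a locally downloaded file from this repo
    model_type="gptj",             # these files are GGML conversions of a GPT-J model
)

prompt = "You are at a tavern. A stranger approaches your table and says:"
print(llm(prompt, max_new_tokens=64, temperature=0.8))
```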
* * *