Crataco committed

Commit e4e9ca5 · Parent: a3008fc

Update README.md

Files changed (1): README.md (+17 -8)
@@ -14,16 +14,25 @@ inference: false
 ### This repository contains quantized conversions of the current Pygmalion 6B checkpoints.
 *For use with frontends that support GGML quantized GPT-J models, such as KoboldCpp and Oobabooga (with the CTransformers loader).*
 
+*Last updated on 2023-09-26.*
+
+**Description:**
+- The `pygmalion-6b-main` files are quantized from the main branch of Pygmalion 6B from 2023-01-13. Also known as "experiment 2".
+- The `pygmalion-6b-dev` files are quantized from the dev branch of Pygmalion 6B from 2023-03-12. Also known as "part 4/10 of experiment 7".
+
+**RAM usage:**
 Model | Startup RAM usage (KoboldCpp) | Startup RAM usage (Oobabooga)
 :--:|:--:|:--:
-pygmalion-6b-main.q4_0.bin | Placeholder | Placeholder
-pygmalion-6b-main.q4_1.bin | Placeholder | Placeholder
-pygmalion-6b-main.q5_0.bin | Placeholder | Placeholder
-pygmalion-6b-main.q5_1.bin | Placeholder | Placeholder
-pygmalion-6b-main.q8_0.bin | Placeholder | Placeholder
-pygmalion-6b-main.f16.bin | Placeholder | Placeholder
-
-The original model card can be found below.
+pygmalion-6b-main.q4_0.bin | 3.7 GiB | 3.7 GiB
+pygmalion-6b-main.q4_1.bin | 4.1 GiB | 4.1 GiB
+pygmalion-6b-main.q5_0.bin | 4.4 GiB | 4.4 GiB
+pygmalion-6b-main.q5_1.bin | 4.8 GiB | 4.8 GiB
+pygmalion-6b-main.q8_0.bin | 6.5 GiB | 6.6 GiB
+
+**Notes:**
+- ggerganov/ggml [[8ca2c19]](https://github.com/ggerganov/ggml/tree/8ca2c19a3bb8622954d858fbf6383522684eaf34)'s gpt-j conversion script was used for conversion and quantization. First they were converted to f16 ggml files, then quantized.
+
+The original model can be found [here](https://huggingface.co/PygmalionAI/pygmalion-6b), and the original model card can be found below.
 
 * * *
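The RAM figures in the new table roughly track the bits-per-weight of each classic GGML quantization format (32-weight blocks with fp16 metadata). A back-of-the-envelope sketch, assuming the classic GGML block layouts and an approximate 6B parameter count — real startup RAM is somewhat higher than the weight size because some tensors stay in f16/f32 and the runtime allocates extra buffers:

```python
# Rough bits-per-weight for the classic GGML quantization formats.
# Each 32-weight block stores the quantized values plus fp16 metadata:
# a scale "d", and for the *_1 formats also an fp16 offset "m".
# Treat these as estimates, not statements about the exact files.
BLOCK = 32

def bits_per_weight(qbits: int, n_fp16_extras: int) -> float:
    """Quantized bits per weight plus amortized per-block fp16 metadata."""
    return (BLOCK * qbits + n_fp16_extras * 16) / BLOCK

FORMATS = {
    "q4_0": bits_per_weight(4, 1),  # 4.5 bpw
    "q4_1": bits_per_weight(4, 2),  # 5.0 bpw
    "q5_0": bits_per_weight(5, 1),  # 5.5 bpw
    "q5_1": bits_per_weight(5, 2),  # 6.0 bpw
    "q8_0": bits_per_weight(8, 1),  # 8.5 bpw
}

N_PARAMS = 6.05e9  # GPT-J 6B parameter count (approximate)

for name, bpw in FORMATS.items():
    gib = N_PARAMS * bpw / 8 / 2**30
    print(f"{name}: {bpw} bpw, about {gib:.1f} GiB of weights")
```

For example, this puts q4_0 at roughly 3.2 GiB of weights versus the 3.7 GiB startup RAM measured above, with the gap covered by unquantized tensors and runtime buffers.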