Decided to switch the RAM usage test models from main to dev

The rest of the main models are taking a long time to upload, and they're virtually the same size with the same RAM usage anyway.
README.md
CHANGED

@@ -23,11 +23,11 @@ inference: false
 **RAM usage:**
 Model | Startup RAM usage (KoboldCpp) | Startup RAM usage (Oobabooga)
 :--:|:--:|:--:
-pygmalion-6b-
-pygmalion-6b-
-pygmalion-6b-
-pygmalion-6b-
-pygmalion-6b-
+pygmalion-6b-dev.q4_0.bin | 3.7 GiB | 3.7 GiB
+pygmalion-6b-dev.q4_1.bin | 4.1 GiB | 4.1 GiB
+pygmalion-6b-dev.q5_0.bin | 4.4 GiB | 4.4 GiB
+pygmalion-6b-dev.q5_1.bin | 4.8 GiB | 4.8 GiB
+pygmalion-6b-dev.q8_0.bin | 6.5 GiB | 6.6 GiB

 **Notes:**
 - ggerganov/ggml [[8ca2c19]](https://github.com/ggerganov/ggml/tree/8ca2c19a3bb8622954d858fbf6383522684eaf34)'s gpt-j conversion script was used for conversion and quantization. First they were converted to f16 ggml files, then quantized.
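The two-step workflow mentioned in the note (convert to f16 ggml, then quantize) can be sketched roughly as below. The script name, binary name, paths, and the numeric quantization-type argument are assumptions based on the gpt-j example shipped in the ggerganov/ggml repository around that commit, not something this commit confirms; adjust them to the tree you actually build.

```shell
# Hedged sketch of the conversion + quantization workflow, assuming the
# ggerganov/ggml examples/gpt-j tooling. Paths and arguments are illustrative.

# 1) Convert the Hugging Face checkpoint to an f16 ggml file.
#    (second argument selects the output float type; 1 = f16 in this example)
python3 examples/gpt-j/convert-h5-to-ggml.py /path/to/pygmalion-6b-dev 1

# 2) Quantize the f16 file into each target format, one run per format.
#    The trailing integer selects the quantization type (assumed mapping).
./build/bin/gpt-j-quantize \
    /path/to/pygmalion-6b-dev/ggml-model-f16.bin \
    /path/to/pygmalion-6b-dev/ggml-model-q4_0.bin \
    2
```

Repeating step 2 with the other type codes would produce the q4_1, q5_0, q5_1, and q8_0 files listed in the table above.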