TheBloke/WizardCoder-Guanaco-15B-V1.0-GGML


TheBloke committed on Jul 9, 2023

Commit c2d4b19 · 1 Parent(s): 4fd7ab4

Update README.md

Files changed (1)

README.md +9 -3


@@ -1,6 +1,13 @@
 ---
 inference: false
-license: other
 ---
 <!-- header start -->
@@ -26,7 +33,7 @@ Please note that these GGMLs are **not compatible with llama.cpp, or currently w
 ## Repositories available
 * [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/WizardCoder-Guanaco-15B-V1.0-GPTQ)
-* [2, 3, 4, 5, 6 and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/WizardCoder-Guanaco-15B-V1.0-GGML)
 * [Unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/LoupGarou/WizardCoder-Guanaco-15B-V1.0)
 ## Prompt template: Alpaca
@@ -69,7 +76,6 @@ As other options become available I will endeavour to update them here (do let m
 | wizardcoder-guanaco-15b-v1.0.ggmlv1.q5_1.bin | q5_1 | 5 | 14.26 GB| 16.76 GB | 5-bit. Even higher accuracy, resource usage and slower inference. |
 | wizardcoder-guanaco-15b-v1.0.ggmlv1.q8_0.bin | q8_0 | 8 | 20.11 GB| 22.61 GB | 8-bit. Almost indistinguishable from float16. High resource use and slow. Not recommended for most users. |
 **Note**: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
 <!-- footer start -->

 ---
 inference: false
+language:
+- en
+datasets:
+- guanaco
+model_hub_library:
+- transformers
+license:
+- apache-2.0
 ---
 <!-- header start -->
 ## Repositories available
 * [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/WizardCoder-Guanaco-15B-V1.0-GPTQ)
+* [4, 5, and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/WizardCoder-Guanaco-15B-V1.0-GGML)
 * [Unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/LoupGarou/WizardCoder-Guanaco-15B-V1.0)
 ## Prompt template: Alpaca
 | wizardcoder-guanaco-15b-v1.0.ggmlv1.q5_1.bin | q5_1 | 5 | 14.26 GB| 16.76 GB | 5-bit. Even higher accuracy, resource usage and slower inference. |
 | wizardcoder-guanaco-15b-v1.0.ggmlv1.q8_0.bin | q8_0 | 8 | 20.11 GB| 22.61 GB | 8-bit. Almost indistinguishable from float16. High resource use and slow. Not recommended for most users. |
 **Note**: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
 <!-- footer start -->
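
A note on the RAM column in the table rows above: in both rows visible in this diff, "Max RAM required" equals the file size plus 2.50 GB (14.26 → 16.76 GB, 20.11 → 22.61 GB). The sketch below applies that offset to estimate RAM for a given file size with no GPU offloading. The constant and the function name are assumptions inferred from those two rows only, not a formula stated in the README; actual usage may differ by loader and context length.

```python
# Assumption: a fixed 2.50 GB overhead, inferred from the two table rows shown
# in this diff (14.26 -> 16.76 GB and 20.11 -> 22.61 GB). Not an official formula.
ASSUMED_OVERHEAD_GB = 2.50


def estimate_max_ram_gb(file_size_gb: float,
                        overhead_gb: float = ASSUMED_OVERHEAD_GB) -> float:
    """Estimate peak RAM (GB) for a GGML file when no layers are offloaded to GPU."""
    if file_size_gb <= 0:
        raise ValueError("file_size_gb must be positive")
    # Round to two decimals to match the precision used in the table.
    return round(file_size_gb + overhead_gb, 2)


if __name__ == "__main__":
    # File sizes taken from the q5_1 and q8_0 rows of the table.
    for quant, size_gb in (("q5_1", 14.26), ("q8_0", 20.11)):
        print(f"{quant}: {size_gb} GB file, ~{estimate_max_ram_gb(size_gb)} GB RAM")
```

If layers are offloaded to the GPU, the README note applies: RAM use drops and VRAM is used instead, so this estimate is an upper bound for the CPU-only case.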