opt-6b-4-bit / README.md


license: other

Quantized for the older version of GPTQ, from before a later GPTQ change broke compatibility with all previously quantized models.

Use with

Clone the two repos into text-generation-webui-testing/repositories

Inside GPTQ-Merged, compile the NVIDIA kernel: python cuda_setup.py install

python server.py --cai-chat --gptq-bits 4 --model opt-6b --autograd

You only need the JSON files, plus merges.txt.
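
As a rough sanity check before launching the server, the file requirement above can be verified with a short script. This is a minimal sketch, not part of either repo: the function name check_model_dir and the path models/opt-6b are assumptions made for illustration, and the script only checks that at least one .json file and merges.txt are present.

```python
from pathlib import Path


def check_model_dir(model_dir):
    """Return a list of problems found in model_dir (an empty list means none found)."""
    root = Path(model_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]

    problems = []
    # The README asks for the JSON files; this only checks that some exist.
    if not any(root.glob("*.json")):
        problems.append("no .json files found")
    # merges.txt is called out explicitly as easy to forget.
    if not (root / "merges.txt").is_file():
        problems.append("merges.txt is missing")
    return problems


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever the model files were placed.
    for problem in check_model_dir("models/opt-6b"):
        print(problem)
```

If the script prints nothing, both kinds of file were found. It does not validate the contents of the files or the quantized weights themselves.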