4-bit quantizing of this model
#2
by
Jdo300 - opened
Hello,
I recently purchased a license and downloaded a copy of this model to run with llama.cpp. In my case, I need to quantize the model so it will run fast enough on my GPU, but when I use convert.py to perform the conversion, I get this error:
FileNotFoundError: Could not find tokenizer.model in /home/.../models/7B/Mistral-7B-Instruct-v0.1-function-calling-v2 or its parent; if it's in another directory, pass the directory as --vocab-dir
I cloned the entire repo to the folder containing the model files and saw that there is a "tokenizer.json" file there. Is this compatible with the tokenizer.model file that the convert script is looking for? If not, what should I use to properly quantize this model?
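In case it helps anyone hitting the same error, here is a rough sketch of the workflow I ended up with. The `tokenizer.model` file is the SentencePiece vocabulary, which this repo doesn't ship but the base `mistralai/Mistral-7B-Instruct-v0.1` repo does, so one option is to copy it from there before converting. Paths, the quantization type, and the exact convert.py flags are illustrative and may differ between llama.cpp versions:

```shell
# Assumed layout: model files under models/7B/Mistral-7B-Instruct-v0.1-function-calling-v2
MODEL_DIR=models/7B/Mistral-7B-Instruct-v0.1-function-calling-v2

# Fetch tokenizer.model from the base Mistral repo (it is not in this repo's files)
huggingface-cli download mistralai/Mistral-7B-Instruct-v0.1 tokenizer.model \
  --local-dir "$MODEL_DIR"

# Convert the HF checkpoint to GGUF at f16 precision
python convert.py "$MODEL_DIR" --outtype f16 --outfile model-f16.gguf

# Quantize to 4-bit (Q4_K_M is a common quality/size trade-off)
./quantize model-f16.gguf model-q4_k_m.gguf Q4_K_M
```

After this, the `model-q4_k_m.gguf` file can be loaded directly with llama.cpp's `main` or server binaries.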
Thank you! I was able to load and run the quantized version of the model with llama.cpp and it works great!
RonanMcGovern changed discussion status to closed