Does desklib support quantization?

#2
by sexyOG - opened

I would like to run desklib locally on a GPU with 8 GB of memory. Since FP16 is not supported, is it okay to use quantization, for example through ONNX Runtime? What about bitsandbytes?

Owner

You can try it and see how it affects the accuracy. You can also use CPU-based inference if you don't need to process a lot of data quickly.
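One way to "see how it affects the accuracy" is to run the same texts through the original and the quantized model and compare the scores. The snippet below is only a minimal sketch of that comparison. It assumes the model produces one score per input and that a label is obtained by thresholding at 0.5. The names `compare_scores` and `DriftReport`, the threshold, and the sample numbers are all hypothetical and are not part of desklib, ONNX Runtime, or bitsandbytes.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DriftReport:
    agreement: float      # fraction of inputs that keep the same label
    max_abs_diff: float   # largest absolute change in score
    mean_abs_diff: float  # average absolute change in score


def compare_scores(
    baseline: Sequence[float],
    quantized: Sequence[float],
    threshold: float = 0.5,  # assumed decision threshold, adjust as needed
) -> DriftReport:
    """Compare per-text scores from the original and the quantized model."""
    if len(baseline) != len(quantized):
        raise ValueError("score lists must have the same length")
    if not baseline:
        raise ValueError("score lists must not be empty")

    pairs = list(zip(baseline, quantized))
    diffs = [abs(b - q) for b, q in pairs]
    # Count inputs whose thresholded label is unchanged by quantization.
    same_label = sum((b >= threshold) == (q >= threshold) for b, q in pairs)
    return DriftReport(
        agreement=same_label / len(pairs),
        max_abs_diff=max(diffs),
        mean_abs_diff=sum(diffs) / len(diffs),
    )


if __name__ == "__main__":
    # Made-up scores purely for illustration, not real model output.
    full_precision = [0.91, 0.12, 0.48, 0.77]
    quantized = [0.89, 0.15, 0.53, 0.74]
    print(compare_scores(full_precision, quantized))
```

If agreement stays close to 1.0 on a sample of your own data, the quantized model is probably good enough for your use case. Scores that sit near the threshold are the ones most likely to flip.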

desklib changed discussion status to closed
