Hi, any pointers on how to quantize this model to int8 weight-only precision?

#1
by tanvij - opened

I'm looking into quantizing this model to int8 precision and I'm wondering if I should manually quantize the weights or use an automated technique like AWQ or bitsandbytes. Any recommendations on which method works best for this model? Thanks!
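
For context, "manually quantizing the weights" to int8 usually means something along the lines of symmetric absmax quantization with one scale per output channel. The sketch below is only an illustration of that arithmetic in plain Python. The toy `weight` matrix and the helper names (`quantize_row_int8`, `quantize_weight_int8`, `dequantize_weight`) are hypothetical and not taken from this model or from any library. A real workflow would operate on the model's tensors instead.

```python
def quantize_row_int8(row):
    """Symmetric absmax quantization of one output channel (row) to int8."""
    absmax = max(abs(w) for w in row)
    # Guard against an all-zero row so the scale is never zero.
    scale = absmax / 127.0 if absmax > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in row]
    return q, scale


def quantize_weight_int8(weight):
    """Quantize a 2-D weight (list of rows) with one scale per row."""
    q_rows, scales = [], []
    for row in weight:
        q, s = quantize_row_int8(row)
        q_rows.append(q)
        scales.append(s)
    return q_rows, scales


def dequantize_weight(q_rows, scales):
    """Recover approximate float weights: w is roughly q * scale."""
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]


# Hypothetical toy weight matrix: 2 output channels, 3 inputs each.
weight = [[0.5, -1.0, 0.25], [0.02, 0.01, -0.04]]
q_rows, scales = quantize_weight_int8(weight)
restored = dequantize_weight(q_rows, scales)
```

Using one scale per row rather than one for the whole matrix keeps small-magnitude channels (like the second row here) from being crushed by a large value elsewhere. The rounding error per weight is at most half of that row's scale.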
