Quantization aware adapter testing

#30
by jbakerx - opened

Make sure the adapter behaves correctly under common serving setups:

- 8-bit / 4-bit quantization
- different inference kernels

Document “known-good” settings on the model card.
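One way to frame the quantization check above: quantize the adapter weights, dequantize them, and assert the round-trip error stays within a tolerance you can then record as a “known-good” setting. Below is a minimal, dependency-free sketch assuming symmetric per-tensor int8 quantization; all function names are illustrative, not part of any real serving stack.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: returns (int weights, scale)."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Map int8 values back to floats."""
    return [x * scale for x in q]

def max_abs_error(a, b):
    """Largest elementwise deviation between two weight lists."""
    return max(abs(x - y) for x, y in zip(a, b))

# Toy adapter weights standing in for a real checkpoint.
weights = [0.12, -0.87, 0.44, 1.03, -0.005]
q, scale = quantize_int8(weights)
err = max_abs_error(weights, dequantize(q, scale))
print(f"max abs round-trip error: {err:.5f}")
# Rounding error is bounded by half a quantization step.
assert err <= scale / 2
```

A real harness would additionally run the quantized and full-precision adapters on a shared eval set and compare task metrics, since per-weight error alone does not guarantee unchanged behavior under different inference kernels.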

We will consider this enhancement for inclusion in version 2.0.0.
