Make sure the adapter behaves under common serving setups:
- 8-bit / 4-bit quantization
- different inference kernels

Document “known-good” settings on the model card.
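One possible way to organize such a check is sketched below. It walks every quantization/kernel combination, compares an evaluation score against an unquantized baseline, and renders the result as a table that could be pasted into the model card. Everything here is an assumption for illustration: the names `QUANT_MODES`, `KERNELS`, `check_settings`, and `to_markdown` are hypothetical, the kernel labels are placeholders, and `run_eval` stands in for whatever loads the adapter under a given setting and returns a quality metric.

```python
from itertools import product
from typing import Callable, Dict, List, Optional

# Hypothetical labels; substitute the options your serving stack actually exposes.
QUANT_MODES = ["none", "8-bit", "4-bit"]
KERNELS = ["default", "alternative"]


def check_settings(
    run_eval: Callable[[str, str], float],
    baseline: float,
    tolerance: float,
) -> List[Dict[str, object]]:
    """Evaluate every (quantization, kernel) pair against a baseline score.

    `run_eval` is a user-supplied callable (an assumption of this sketch) that
    loads the adapter under the given setting and returns a scalar metric.
    A setting counts as known-good when its score is within `tolerance`
    of `baseline`; a setting that raises is recorded as not known-good.
    """
    results: List[Dict[str, object]] = []
    for quant, kernel in product(QUANT_MODES, KERNELS):
        score: Optional[float]
        try:
            score = run_eval(quant, kernel)
            ok = abs(score - baseline) <= tolerance
            note = ""
        except Exception as exc:  # record the failure instead of aborting the sweep
            score, ok, note = None, False, f"error: {exc}"
        results.append(
            {"quant": quant, "kernel": kernel, "score": score, "ok": ok, "note": note}
        )
    return results


def to_markdown(results: List[Dict[str, object]]) -> str:
    """Render the sweep as a markdown table suitable for a model card."""
    lines = [
        "| Quantization | Kernel | Score | Known-good |",
        "| --- | --- | --- | --- |",
    ]
    for r in results:
        score = "n/a" if r["score"] is None else f"{r['score']:.3f}"
        status = "yes" if r["ok"] else "no"
        lines.append(f"| {r['quant']} | {r['kernel']} | {score} | {status} |")
    return "\n".join(lines)
```

The tolerance and the metric are left to the caller on purpose, since what counts as acceptable degradation depends on the task the adapter was trained for.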
We will consider this enhancement for inclusion in version 2.0.0.