Model request?
#1
by pathosethoslogos - opened
Just wondering if you take model requests
If you do, please do a 2-bit quant of this! :)
Which kind of 2-bit quant are you looking for?
This quantized model uses turboquant-vllm for weight quantization. As far as I know, it only supports TurboQuant at 3 bits and 4 bits right now.
If you are on Apple Silicon, you can try MiniMax-M2.7-JANGTQ and MiniMax-M2.7-JANG_2L. They are roughly 2-bit quants, but they only support MLX so far. I've recently been working on a CUDA port, but it's not ready yet.
AutoRound if possible, if not then AWQ please! 😊