Model request?

#1
by pathosethoslogos - opened

Just wondering if you take model requests

If you do, please do a 2 bit quant of this! :)


Which kind of 2 bit quant are you looking for?

This quantized model uses turboquant-vllm for weight quantization. As far as I know, it only supports TurboQuant 3-bit and 4-bit right now.

If you are on Apple Silicon, you can try MiniMax-M2.7-JANGTQ and MiniMax-M2.7-JANG_2L, which are roughly 2-bit quants. They only support MLX so far; I've been working on a CUDA port recently, but it is not ready yet.
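For anyone curious what a 2-bit quant does mechanically, here is a minimal toy sketch of groupwise asymmetric 2-bit quantization: each group of weights is mapped to one of 4 integer levels plus a per-group scale and minimum. This is purely illustrative and is NOT the actual TurboQuant, AutoRound, or MLX implementation; the function names and group size are made up for the example.

```python
import numpy as np

def quantize_2bit(w, group_size=8):
    # Toy asymmetric 2-bit quantization (hypothetical helper, not a real
    # library API). Each group of `group_size` weights is encoded as an
    # integer level in 0..3 plus a float scale and per-group minimum.
    w = w.reshape(-1, group_size)
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / 3.0            # 4 levels -> span of 3 steps
    scale = np.where(scale == 0, 1.0, scale) # avoid divide-by-zero groups
    q = np.clip(np.round((w - w_min) / scale), 0, 3).astype(np.uint8)
    return q, scale, w_min

def dequantize_2bit(q, scale, w_min):
    # Reconstruct approximate weights from the 2-bit codes.
    return q * scale + w_min

w = np.random.randn(64).astype(np.float32)
q, s, m = quantize_2bit(w)
w_hat = dequantize_2bit(q, s, m).reshape(-1)
# Rounding error per weight is bounded by half a quantization step.
print(q.max() <= 3, np.abs(w - w_hat).max() <= s.max() / 2 + 1e-6)
```

Real schemes like AutoRound and AWQ add calibration on top of this (tuning the rounding and scales against activation data), which is why they lose much less quality at the same bit width.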


AutoRound if possible, if not then AWQ please! 😊
