Looking forward to the quantized versions

#1
by Qnibbles - opened

Quantized GGUF versions would be good, but something that works well on vLLM would be even better for me, especially NVFP4 to maximize performance on a Blackwell RTX PRO. Thanks for the REAPs.

No problem... however, the model might have mismatched tensor sizes or something similar in that regard. I'll test loading all the variants and then see if I can fix it if they do have a problem.

This could be due to MiniMax shipping the model in native FP8, or to the pruning process itself.
