NVFP4 Quant

#14
by tuanlai001 - opened

The BF16 model weights are about 128 GB. I wonder whether an NVFP4-quantized version of the model would fit on a single RTX PRO 6000 Blackwell.
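A rough back-of-the-envelope estimate, as a sketch only. It assumes every weight is quantized to NVFP4 (4-bit values plus one 8-bit FP8 scale per 16-element block, so about 4.5 bits per weight), that the card has 96 GB of VRAM, and it ignores KV cache, activations, and any layers kept in higher precision. The function name and the headroom calculation are illustrative, not taken from any library.

```python
def nvfp4_weight_gb(bf16_gb: float, block_size: int = 16, scale_bits: int = 8) -> float:
    """Estimate NVFP4 weight size from a BF16 checkpoint size.

    Assumption: all weights are quantized; real checkpoints usually keep
    some tensors (embeddings, norms, output head) in higher precision.
    """
    # 4 bits per value plus the per-block scale amortized over the block.
    bits_per_weight = 4 + scale_bits / block_size
    # BF16 stores 16 bits per weight, so scale the size by the bit ratio.
    return bf16_gb * bits_per_weight / 16


bf16_gb = 128.0   # figure quoted in the post
vram_gb = 96.0    # assumed VRAM of the RTX PRO 6000 Blackwell

weights_gb = nvfp4_weight_gb(bf16_gb)
headroom_gb = vram_gb - weights_gb  # left for KV cache, activations, runtime overhead

print(f"Estimated NVFP4 weights: {weights_gb:.1f} GB")
print(f"Approximate headroom:    {headroom_gb:.1f} GB")
```

Under these assumptions the weights alone come to roughly 36 GB, which would leave a fair amount of room on the card; the real footprint depends on which layers stay unquantized and on the context length you run.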
