Request for GLM 4.6V
#1 by SFPLM - opened
Hi there, thanks for the NVFP4 quants of the ~200B MoE LLMs.
Would it be possible for you to do an NVFP4 quant of GLM 4.6V?
Thanks.
Sure, I will look into it tonight after work.
@SFPLM I have attempted to quantize GLM-4.6V, but both the vLLM compressor and NVIDIA ModelOpt are failing.
I have written some custom quantization scripts that apply a novel calibration approach. I will work with them today and see if I can get the model to run without the massive loss in accuracy I am getting with the two "out of the box" methods.
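For anyone following along, here is a rough sketch of what NVFP4-style block quantization does to a weight tensor and how the resulting error can be measured. It is not the custom calibration script mentioned above and uses neither of the two toolkits. It is a pure-Python illustration under stated assumptions: values are rounded to the 4-bit E2M1 grid in blocks of 16, and the per-block scale is kept in full precision, whereas real NVFP4 stores it in FP8. The function names (`fake_quantize`, `relative_rms_error`) are hypothetical.

```python
import math
import random

# Magnitudes representable by a 4-bit E2M1 float (sign is handled separately).
E2M1_LEVELS = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK_SIZE = 16  # NVFP4 shares one scale across 16 consecutive values


def fake_quantize_block(block):
    """Round one block to the E2M1 grid and return the dequantized values."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return [0.0] * len(block)
    # Simplification (assumption): the scale stays in full precision here;
    # real NVFP4 stores it as FP8 (E4M3) alongside a per-tensor FP32 scale.
    scale = amax / E2M1_LEVELS[-1]
    out = []
    for x in block:
        nearest = min(E2M1_LEVELS, key=lambda level: abs(abs(x) / scale - level))
        out.append(math.copysign(nearest * scale, x))
    return out


def fake_quantize(values):
    """Quantize and dequantize a flat list of floats, block by block."""
    out = []
    for start in range(0, len(values), BLOCK_SIZE):
        out.extend(fake_quantize_block(values[start:start + BLOCK_SIZE]))
    return out


def relative_rms_error(reference, approx):
    """RMS of the difference divided by RMS of the reference."""
    num = sum((r - a) ** 2 for r, a in zip(reference, approx))
    den = sum(r ** 2 for r in reference)
    return math.sqrt(num / den) if den else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for a weight row; not taken from any real model.
    weights = [rng.gauss(0.0, 0.02) for _ in range(4096)]
    weights[7] = 1.0  # a single outlier stretches the scale of its block
    err = relative_rms_error(weights, fake_quantize(weights))
    print(f"relative RMS error: {err:.4f}")
```

Because each block's scale is set by its largest magnitude, one outlier pushes the other 15 values in that block toward zero on the coarse 8-level grid. That is one plausible source of accuracy loss that calibration schemes try to reduce.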