Request for GLM 4.6V

#1
by SFPLM - opened

Hi there, thanks for the ~200B MoE LLMs you have quantized to NVFP4.

Would it be possible for you to do an NVFP4 quant of GLM 4.6V?

Thanks.

Sure, I will look into it tonight after work.

@SFPLM I have attempted to quantize GLM-4.6V, but both vLLM's LLM Compressor and NVIDIA's ModelOpt are failing.

I have written some custom quantization scripts that apply a novel calibration approach. I will work with them today and see if I can get them to run without the massive loss in accuracy I am getting with the two "out of the box" methods.

GadflyII/GLM-4.6V-NVFP4

@SFPLM
