Warning: This model has severe quality issues. Use it only for research purposes. You can find verified quantized models of good quality here.

This is zai-org/GLM-4.7-Flash quantized with AutoRound to mixed precision with a target of 3.5 bits per weight. The model is compatible with vLLM (tested: v0.15.0). Tested with an RTX Pro 6000 WK.
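Since the card states vLLM compatibility, a minimal serving sketch may help; this assumes vLLM v0.15.0 is installed, a GPU with enough memory is available, and uses only the repo id from this card (the port value is an illustrative default):

```shell
# Serve the quantized checkpoint with vLLM's OpenAI-compatible server.
# AutoRound checkpoints are loaded from their saved quantization config,
# so no extra quantization flag should be needed.
vllm serve kaitchup/GLM-4.7-Flash-autoround-3.5bpw \
    --port 8000
```

Once the server is up, any OpenAI-compatible client can query it at `http://localhost:8000/v1`. Given the quality warning above, treat outputs as research artifacts only.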

Safetensors model size: 5B params (tensor types: I32, F16, F32, BF16).

Model tree: kaitchup/GLM-4.7-Flash-autoround-3.5bpw is one of 68 quantized versions of zai-org/GLM-4.7-Flash.