Warning: This model has severe quality issues. Use it only for research purposes. Verified quantized models of good quality are available elsewhere on The Kaitchup's Hugging Face page.
This is zai-org/GLM-4.7-Flash quantized with AutoRound to mixed precision, with a target of 3.5 bits per weight. The model is compatible with vLLM (tested with v0.15.0) and was tested on an RTX Pro 6000 WK.
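Since the card states vLLM compatibility, the model can be served with vLLM's standard CLI. A minimal sketch, assuming vLLM v0.15.0 is installed and a GPU with enough memory for the 3.5-bpw checkpoint is available; the port is illustrative:

```shell
# Launch an OpenAI-compatible server for the quantized checkpoint.
# --trust-remote-code is often needed for custom GLM architectures (assumption).
vllm serve kaitchup/GLM-4.7-Flash-autoround-3.5bpw \
  --trust-remote-code \
  --port 8000
```

Once running, the server accepts requests at `http://localhost:8000/v1` with any OpenAI-compatible client.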
- Developed by: The Kaitchup
Model tree for kaitchup/GLM-4.7-Flash-autoround-3.5bpw
- Base model: zai-org/GLM-4.7-Flash