Subscribe and Support

This is zai-org/GLM-4.7-Flash quantized with llm-compressor to FP8. The model is compatible with vLLM (tested: v0.14.0). Tested with an L4 (Google Colab).

Developed by: The Kaitchup
License: lfm1.0

Downloads last month: 60

Safetensors

Model size

30B params

Tensor type

F32

BF16

F8_E4M3

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for kaitchup/GLM-4.7-Flash-FP8-Dynamic

Base model

zai-org/GLM-4.7-Flash

Quantized

(84)

this model