---
license: mit
base_model:
- zai-org/GLM-4.7-Flash
tags:
- llm-compressor
---
<div align="center">
<img
src="https://cdn-uploads.huggingface.co/production/uploads/64b93e6bd6c468ac7536607e/mj6xac74jHGLqymiovObc.png"
alt="The Kaitchup -- AI on a Budget"
style="width: 100%; max-width: 100%; height: auto; display: inline-block; margin-bottom: 0.5em; margin-top: 0.5em;"
/>
<div style="display: flex; justify-content: center; gap: 0.5em; margin-bottom: 1em;">
<a href="https://kaitchup.substack.com/subscribe"><strong>Subscribe and Support</strong></a>
</div>
</div>
This is [zai-org/GLM-4.7-Flash](https://huggingface.co/zai-org/GLM-4.7-Flash) quantized to FP8 with [llm-compressor](https://github.com/vllm-project/llm-compressor). The model is compatible with vLLM (tested with v0.14.0 on an L4 GPU in Google Colab).
- **Developed by:** [The Kaitchup](https://kaitchup.substack.com/)
- **License:** MIT