warnold-nv/glm-4.7-modelopt-fp8

glm-4.7-modelopt-fp8 / hf_quant_config.json

Add files using upload-large-folder tool

ed14abf verified 7 days ago

291 Bytes

{
    "producer": {
        "name": "modelopt",
        "version": "0.42.0rc1.dev9+ge53ca61b7"
    },
    "quantization": {
        "quant_algo": "FP8",
        "kv_cache_quant_algo": "FP8",
        "exclude_modules": [
            "lm_head",
            "model.layers.92*"
        ]
    }
}
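
The config above can be read with a few lines of Python. The sketch below is illustrative only: it assumes the entries in "exclude_modules" are glob-style patterns (which the trailing "*" suggests, but the file itself does not state), and the helper is_excluded and the sample module names are hypothetical, not part of modelopt or any inference engine.

```python
import json
from fnmatch import fnmatchcase

# Contents of hf_quant_config.json, inlined so the example is self-contained.
CONFIG_TEXT = """
{
    "producer": {"name": "modelopt", "version": "0.42.0rc1.dev9+ge53ca61b7"},
    "quantization": {
        "quant_algo": "FP8",
        "kv_cache_quant_algo": "FP8",
        "exclude_modules": ["lm_head", "model.layers.92*"]
    }
}
"""


def is_excluded(module_name, patterns):
    """Return True if module_name matches any exclusion pattern.

    Assumption: patterns use shell-style globbing. A real loader may
    apply different matching rules.
    """
    return any(fnmatchcase(module_name, pattern) for pattern in patterns)


config = json.loads(CONFIG_TEXT)
quant = config["quantization"]
patterns = quant["exclude_modules"]

print("weights:", quant["quant_algo"], "| KV cache:", quant["kv_cache_quant_algo"])

# Hypothetical module names, chosen only to exercise the patterns.
for name in ("lm_head",
             "model.layers.92.mlp.down_proj",
             "model.layers.9.self_attn.q_proj"):
    status = "kept unquantized" if is_excluded(name, patterns) else "quantized to FP8"
    print(f"{name}: {status}")
```

Under this glob assumption, "model.layers.92*" covers layer 92 and everything beneath it, while "model.layers.9" is not matched because the pattern requires the literal prefix "model.layers.92".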