AngelSlim / Qwen3-32B_nvfp4

8-bit precision

compressed-tensors

Qwen3-32B_nvfp4 / hf_quant_config.json

Upload hf_quant_config.json with huggingface_hub

e07c290 verified 7 months ago

243 Bytes

{
  "quantization": {
    "exclude_modules": [
      "lm_head",
      "model.embed_tokens"
    ],
    "kv_cache_quant_algo": null,
    "quant_algo": "NVFP4",
    "group_size": 16
  }
}
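
The following is a minimal sketch, in Python, of how a loader might read the fields of this config. It is an illustration only: the helper names (`load_quant_config`, `is_excluded`, `scale_groups`) and the prefix-matching rule for `exclude_modules` are assumptions made here, not the API or behavior of any particular inference library. The sample text deliberately repeats one `exclude_modules` entry to show that duplicates can be dropped on load.

```python
import json

# Sample quantization section; "lm_head" is repeated on purpose
# to exercise the de-duplication step below.
CONFIG_TEXT = """
{
  "quantization": {
    "exclude_modules": ["lm_head", "model.embed_tokens", "lm_head"],
    "kv_cache_quant_algo": null,
    "quant_algo": "NVFP4",
    "group_size": 16
  }
}
"""


def load_quant_config(text):
    """Parse the JSON text and return its "quantization" section.

    Repeated names in exclude_modules are dropped, keeping first-seen order.
    """
    quant = json.loads(text)["quantization"]
    quant["exclude_modules"] = list(dict.fromkeys(quant["exclude_modules"]))
    return quant


def is_excluded(module_name, quant):
    """Return True if a module should stay unquantized.

    Assumed rule (hypothetical): a module is excluded when its name equals
    an entry or sits underneath one in the dotted module path.
    """
    return any(
        module_name == entry or module_name.startswith(entry + ".")
        for entry in quant["exclude_modules"]
    )


def scale_groups(num_elements, quant):
    """Number of group_size-wide blocks needed to cover num_elements.

    Uses ceiling division, so a partial trailing block counts as one group.
    """
    group_size = quant["group_size"]
    return -(-num_elements // group_size)


if __name__ == "__main__":
    quant = load_quant_config(CONFIG_TEXT)
    print(quant["quant_algo"], quant["group_size"])
    print(is_excluded("lm_head", quant))
    # 4096 is an arbitrary example width, not a value taken from the model.
    print(scale_groups(4096, quant))
```

With `group_size` set to 16, each run of 16 consecutive values along the grouped dimension shares one scale, so `scale_groups` gives a rough count of scales per row under that assumption. A `null` value for `kv_cache_quant_algo` is parsed as `None`, which this sketch reads as "no KV-cache quantization configured".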