how to quantize this model?
#1 by NEWWWWWbie - opened
Hi,
Can I ask how you are quantizing the DeepSeek-R1-Distill-Qwen-32B model with HIGGS?
I tried the following config directly, but the generated output seems broken:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, HiggsConfig

model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"

higgs_config = HiggsConfig(
    bits=4,
    group_size=128,
    damp_percent=0.01,
    modules_to_not_convert=[],
    desc_act=True,
    scale_dtype="fp16",
    block_name_to_quantize="all",
    optimize_target="latency",
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=higgs_config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
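For context, I suspect most of the extra kwargs above (group_size=128, damp_percent, desc_act, scale_dtype, block_name_to_quantize, optimize_target) are GPTQ-style options rather than anything HiggsConfig actually uses. As far as I can tell from the transformers docs, the HIGGS backend is mainly driven by the bit width (and it needs the flute-kernel package and a CUDA GPU at inference time). A minimal sketch of what I think the intended call should look like, untested on my side and with the parameter set being my assumption, is:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, HiggsConfig

model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"

# Minimal HIGGS config: just the bit width, leaving everything else
# at the library defaults. (Assumption: HIGGS in transformers requires
# the flute-kernel package and a CUDA device to run.)
quant_config = HiggsConfig(bits=4)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Quick sanity check that generation is not garbled.
inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

If the stripped-down config still produces broken output, that would at least rule out the extra kwargs as the cause.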
