how to quantize this model?

#1
by NEWWWWWbie - opened

Hi,
Can I ask how you guys are quantizing the DeepSeek Distill Qwen 32B model using HIGGS?

I tried using the following config directly, but the output seems broken:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, HiggsConfig

model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
higgs_config = HiggsConfig(
    bits=4,
    group_size=128,
    damp_percent=0.01,
    modules_to_not_convert=[],
    desc_act=True,
    scale_dtype="fp16",
    block_name_to_quantize="all",
    optimize_target="latency",
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=higgs_config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
```
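For comparison, here is a stripped-down version I was considering. It keeps only the parameters I believe `HiggsConfig` actually accepts (going by the `transformers` docs; `damp_percent`, `desc_act`, `scale_dtype`, `block_name_to_quantize`, and `optimize_target` look like `GPTQConfig` options to me, so I'm not sure they do anything here). Is this the right direction, or do you pass something else?

```python
# Minimal sketch, assuming HiggsConfig only takes the fields documented
# in transformers (bits, p, modules_to_not_convert, hadamard_size,
# group_size); the GPTQ-style kwargs from my config above are dropped.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, HiggsConfig

model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"

# Leave everything except the bit-width at its default.
higgs_config = HiggsConfig(bits=4)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=higgs_config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```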

(Attached screenshot: image.png, showing the broken output.)
