Are there plans to quantize the deepseek-v3.1/deepseek-v3.2 models and upload them?
#13 opened 3 days ago by verigle
Can the w8a8_int8 quantization setting support ignoring layers?
#12 opened 8 months ago by jgfly
Can I run this model on AMD GPUs, or is it only compatible with Nvidia GPUs?
#11 opened 9 months ago by luciagan
Update inference/bf16_cast_channel_int8.py
#10 opened 10 months ago by HandH1998
Update config.json
#9 opened 10 months ago by HandH1998
How to achieve 2500 TPS throughput?
#8 opened 10 months ago by muziyongshixin
Can this model run with `ollama` in `pure cpu` mode?
#7 opened 10 months ago by ice6
Add `quantization_config` in config.json?
#4 opened 10 months ago by WeiwenXia
sglang reports an OOM error after running channel INT8
#3 opened 10 months ago by zhangneilc