# DLM-2.1-14B-FP8 / recipe.yaml
# Upload FP8 quantized DNA-2.1-14B (static per-tensor W8A8, llm-compressor)
default_stage:
  default_modifiers:
    QuantizationModifier:
      targets: [Linear]
      ignore: [lm_head]
      scheme: FP8
      bypass_divisibility_checks: false
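
# For readers unfamiliar with the preset: `scheme: FP8` is shorthand. The
# commented-out sketch below spells out what it is assumed to expand to,
# based on the "static per-tensor W8A8" description in the upload note:
# 8-bit float weights and input activations, one scale per tensor,
# symmetric, with activation scales fixed at calibration time rather than
# computed at runtime. The `config_groups` layout, the group name `group_0`,
# and the field names are assumptions about the compressed-tensors schema,
# not copied from this repository; check them against the library version
# in use before relying on them.
#
# default_stage:
#   default_modifiers:
#     QuantizationModifier:
#       ignore: [lm_head]
#       config_groups:
#         group_0:                # hypothetical group name
#           targets: [Linear]
#           weights:
#             num_bits: 8
#             type: float
#             strategy: tensor    # one scale per weight tensor
#             symmetric: true
#             dynamic: false
#           input_activations:
#             num_bits: 8
#             type: float
#             strategy: tensor    # one scale per activation tensor
#             symmetric: true
#             dynamic: false      # static: scale comes from calibration
#
# Keeping the preset form above is usually preferable; the expanded form is
# mainly useful when a single field, such as `dynamic`, needs to differ.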