metadata

license: mit
base_model:
  - deepseek-ai/DeepSeek-R1
base_model_relation: quantized

DeepSeek-R1-W4AFP8

This model is a mixed-precision quantized version of DeepSeek-R1: the dense layers use FP8_BLOCK_SCALING, and the MoE layers use INT4 weights with FP8 activations.
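
To give a rough sense of what "INT4 weights" means for the MoE layers, here is a minimal pure-Python sketch of symmetric per-group INT4 quantization. It is an illustration under stated assumptions, not the procedure used to produce this checkpoint: the group size, the function names `quantize_int4` and `dequantize_int4`, and the symmetric rounding scheme are all hypothetical, and FP8 activations and FP8 block scaling are not modeled here.

```python
# Illustrative only: symmetric per-group INT4 weight quantization.
# GROUP_SIZE and both function names are assumptions for this sketch;
# they are not taken from the checkpoint or from any library.
GROUP_SIZE = 4  # hypothetical; chosen small so the example is easy to follow


def quantize_int4(weights, group_size=GROUP_SIZE):
    """Return (int4_values, scales), with one scale per group of weights."""
    q, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Map the largest magnitude in the group to 7, the largest positive INT4 value.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        scales.append(scale)
        # Round to the nearest integer and clamp to the signed 4-bit range [-8, 7].
        q.extend(max(-8, min(7, round(w / scale))) for w in group)
    return q, scales


def dequantize_int4(q, scales, group_size=GROUP_SIZE):
    """Reconstruct approximate weights from INT4 values and per-group scales."""
    return [v * scales[i // group_size] for i, v in enumerate(q)]


weights = [0.7, -0.3, 0.1, 0.0, 14.0, -2.0, 6.0, 4.0]
q, scales = quantize_int4(weights)
restored = dequantize_int4(q, scales)
```

Each group keeps its own scale, so a group with large weights does not force coarse steps onto a group with small ones. The reconstruction error per weight is bounded by half of that group's scale.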