Barrrrry/DeepSeek-R1-W4AFP8

8-bit precision

---
license: mit
base_model:
- deepseek-ai/DeepSeek-R1
base_model_relation: quantized
---
# DeepSeek-R1-W4AFP8

This model is a mixed-precision quantized version of DeepSeek-R1: the dense layers use `FP8_BLOCK_SCALING`, and the MoE layers use INT4 weights with FP8 activations.
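
To give a rough idea of what "INT4 weights" means for the MoE layers, here is a minimal pure-Python sketch of symmetric per-group INT4 quantization. It is illustrative only: the symmetric scheme, the group size, and the function names `quantize_int4` and `dequantize_int4` are assumptions, not details taken from this checkpoint, and the FP8 activation path is not modeled.

```python
def quantize_int4(weights, group_size=4):
    """Quantize a flat list of floats to signed 4-bit codes, one scale per group.

    Assumed scheme (hypothetical, not confirmed for this model): symmetric,
    with each group's largest magnitude mapped to the code 7.
    """
    if len(weights) % group_size != 0:
        raise ValueError("len(weights) must be a multiple of group_size")
    codes, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Avoid dividing by zero for an all-zero group.
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        scales.append(scale)
        for w in group:
            q = round(w / scale)
            # Clamp to the signed 4-bit range [-8, 7].
            codes.append(max(-8, min(7, q)))
    return codes, scales


def dequantize_int4(codes, scales, group_size=4):
    """Map 4-bit codes back to floats using the scale of each code's group."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]


if __name__ == "__main__":
    weights = [0.7, -0.35, 0.1, 0.0, 14.0, -7.0, 2.0, 1.0]
    codes, scales = quantize_int4(weights)
    restored = dequantize_int4(codes, scales)
    for w, r in zip(weights, restored):
        print(f"{w:+.3f} -> {r:+.3f}")
```

Each group stores one floating-point scale plus 4 bits per weight, so smaller groups track outliers more closely at the cost of more scale storage. Under this scheme the reconstruction error of any weight is at most half of its group's scale.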