nvidia/DeepSeek-V3.1-NVFP4

chenjiel committed 22 days ago

Commit e9c215a · verified · 1 parent: 2219e90

Update README.md

Files changed (1)

README.md +0 -2

README.md CHANGED

@@ -16,8 +16,6 @@ tags:
 ## Description:
 The NVIDIA DeepSeek-V3.1-FP4 model is the quantized version of the DeepSeek AI's DeepSeek-V3.1 model, which is an auto-regressive language model that uses an optimized transformer architecture. For more information, please check [here](https://huggingface.co/deepseek-ai/DeepSeek-V3.1). The NVIDIA DeepSeek V3.1 FP4 model is quantized with [TensorRT Model Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer).
-Compared to [nvidia/DeepSeek-V3.1-FP4](https://huggingface.co/nvidia/DeepSeek-V3.1-FP4), this checkpoint additionally quantizes the wo module in attention layers.
 This model is ready for commercial/non-commercial use.  <br>
 ## Third-Party Community Consideration
