Update README.md
README.md
CHANGED
@@ -16,8 +16,6 @@ tags:
 ## Description:
 The NVIDIA DeepSeek-V3.1-FP4 model is the quantized version of the DeepSeek AI's DeepSeek-V3.1 model, which is an auto-regressive language model that uses an optimized transformer architecture. For more information, please check [here](https://huggingface.co/deepseek-ai/DeepSeek-V3.1). The NVIDIA DeepSeek V3.1 FP4 model is quantized with [TensorRT Model Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer).
 
-Compared to [nvidia/DeepSeek-V3.1-FP4](https://huggingface.co/nvidia/DeepSeek-V3.1-FP4), this checkpoint additionally quantizes the wo module in attention layers.
-
 This model is ready for commercial/non-commercial use. <br>
 
 ## Third-Party Community Consideration