Update README.md
README.md CHANGED

@@ -22,7 +22,7 @@ from transformers import AutoModelForCausalLM, AutoTokenizer
 
 import torch
 
-quantized_model_dir =
+quantized_model_dir = "OPEA/DeepSeek-R1-0528-int4-AutoRound"
 
 model = AutoModelForCausalLM.from_pretrained(
     quantized_model_dir,
@@ -213,7 +213,7 @@ python -m vllm.entrypoints.openai.api_server \
     --gpu-memory-utilization 0.97 \
     --dtype float16 \
     --served-model-name deepseek-reasoner \
-    --model
+    --model OPEA/DeepSeek-R1-0528-int4-AutoRound
 ```
 
 2. Inference using OpenAI-compatible API syntax:
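The "Inference using OpenAI-compatible API syntax" step that the second hunk leads into could be sketched as below. This is an assumption, not part of the diff: it presumes the vLLM server above is running at its default address (`http://localhost:8000/v1`), and reuses the `deepseek-reasoner` name from the `--served-model-name` flag. Only the standard library is used, so no OpenAI client package is required.

```python
import json
import urllib.request

# Default vLLM OpenAI-compatible endpoint (assumption -- adjust host/port
# if the server was launched with different settings).
API_URL = "http://localhost:8000/v1/chat/completions"

# Chat-completions payload; "model" must match --served-model-name.
payload = {
    "model": "deepseek-reasoner",
    "messages": [
        {"role": "user", "content": "Explain INT4 quantization in one sentence."}
    ],
    "max_tokens": 256,
}

def chat(url: str = API_URL) -> str:
    """POST the payload to the server and return the assistant's reply text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible responses put the generated text here.
    return body["choices"][0]["message"]["content"]
```

Calling `chat()` requires the vLLM server from the command above to be running; the payload itself can be inspected or adapted without it.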