deepseek_v3 · 4-bit precision · gptq
Commit 99570a8 (verified) · cicdatopea committed · Parent: 4f6d8a7

Update README.md
Files changed (1): README.md (+2 −2)
README.md CHANGED
@@ -22,7 +22,7 @@ from transformers import AutoModelForCausalLM, AutoTokenizer
 
 import torch
 
-quantized_model_dir = "OPEA/DeepSeek-R1-0528-int4-ar"
+quantized_model_dir = "OPEA/DeepSeek-R1-0528-int4-AutoRound"
 
 model = AutoModelForCausalLM.from_pretrained(
     quantized_model_dir,
@@ -213,7 +213,7 @@ python -m vllm.entrypoints.openai.api_server \
     --gpu-memory-utilization 0.97 \
     --dtype float16 \
     --served-model-name deepseek-reasoner \
-    --model Intel/DeepSeek-R1-0528-int4-ar
+    --model OPEA/DeepSeek-R1-0528-int4-AutoRound
 ```
 
 2. Inference using OpenAI-compatible API syntax:
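The "OpenAI-compatible API" step the diff ends on typically issues a standard chat-completions request against the vLLM server started above. A minimal sketch using only the standard library — the port 8000 and the `/v1` path are vLLM's server defaults and are assumptions here, as is the `query_vllm` helper name:

```python
import json
import urllib.request


def build_chat_request(prompt: str, base_url: str = "http://localhost:8000/v1"):
    """Build an OpenAI-compatible /chat/completions request (sketch, not official API code)."""
    payload = {
        # Must match the --served-model-name passed to the vLLM server above.
        "model": "deepseek-reasoner",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def query_vllm(prompt: str) -> str:
    """Send the request and return the assistant's reply (requires the server to be running)."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

The same request can equally be made with the `openai` Python client by pointing its `base_url` at the vLLM server.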