Update README.md

README.md (changed)

@@ -14,17 +14,46 @@ Optimized models are published here in [ONNX](https://onnx.ai) format to run wit
Before:

```bash
# Download the model directly using the
huggingface-cli download onnxruntime/DeepSeek-R1-Distill-ONNX --include 'deepseek-r1-distill-qwen-1.5B/*' --local-dir .
```

```bash
#
curl -o https://raw.githubusercontent.com/microsoft/onnxruntime-genai/refs/heads/main/examples/python/model-chat.py
python model-chat.py -m
```
After:

To easily get started with the model, you can use our ONNX Runtime Generate() API. See instructions [here](https://github.com/microsoft/onnxruntime/blob/gh-pages/docs/genai/tutorials/deepseek-python.md).
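The Generate() API itself is only a few calls; below is a minimal sketch of the loop that the example script builds on. The model directory, the search options, and the exact method names (`append_tokens`, `get_sequence`) follow recent onnxruntime-genai examples and are assumptions here — check the linked tutorial if your installed version differs.

```python
# Minimal sketch of the ONNX Runtime Generate() API loop (method names assumed
# from recent onnxruntime-genai examples; requires a downloaded model folder).
import onnxruntime_genai as og

# Path to one of the downloaded model folders -- adjust to your layout
model = og.Model("./deepseek-r1-distill-qwen-1.5B/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4")
tokenizer = og.Tokenizer(model)

# User message wrapped in the DeepSeek-R1-Distill chat template
prompt = "<|begin▁of▁sentence|><|User|>What is 1 + 1?<|Assistant|>"

params = og.GeneratorParams(model)
params.set_search_options(max_length=512)

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode(prompt))
while not generator.is_done():
    generator.generate_next_token()

print(tokenizer.decode(generator.get_sequence(0)))
```

The `model-chat.py` script used below wraps this same loop with streaming output and a chat history.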
For CPU:

```bash
# Download the model directly using the Hugging Face CLI
huggingface-cli download onnxruntime/DeepSeek-R1-Distill-ONNX --include 'deepseek-r1-distill-qwen-1.5B/cpu_and_mobile/*' --local-dir .

# Install the CPU package of ONNX Runtime GenAI
pip install onnxruntime-genai

# Please adjust the model directory (-m) accordingly
curl -O https://raw.githubusercontent.com/microsoft/onnxruntime-genai/refs/heads/main/examples/python/model-chat.py
python model-chat.py -m /path/to/cpu-int4-rtn-block-32-acc-level-4/ -e cpu --chat_template "<|begin▁of▁sentence|><|User|>{input}<|Assistant|>"
```
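The `--chat_template` value is a plain format string: the script substitutes the user's message for the `{input}` placeholder before tokenizing. A quick, model-free illustration of the prompt that template produces (the `format` call is an assumption about the script's internals, but the template string is the one used above):

```python
# DeepSeek-R1-Distill chat template, as passed via --chat_template above.
chat_template = "<|begin▁of▁sentence|><|User|>{input}<|Assistant|>"

def build_prompt(user_message: str) -> str:
    """Substitute the user's message into the {input} placeholder."""
    return chat_template.format(input=user_message)

print(build_prompt("What is 1 + 1?"))
# → <|begin▁of▁sentence|><|User|>What is 1 + 1?<|Assistant|>
```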
For CUDA:

```bash
# Download the model directly using the Hugging Face CLI
huggingface-cli download onnxruntime/DeepSeek-R1-Distill-ONNX --include 'deepseek-r1-distill-qwen-1.5B/cuda/*' --local-dir .

# Install the CUDA package of ONNX Runtime GenAI
pip install onnxruntime-genai-cuda

# Please adjust the model directory (-m) accordingly
curl -O https://raw.githubusercontent.com/microsoft/onnxruntime-genai/refs/heads/main/examples/python/model-chat.py
python model-chat.py -m /path/to/cuda-int4-rtn-block-32/ -e cuda --chat_template "<|begin▁of▁sentence|><|User|>{input}<|Assistant|>"
```
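The `--include` flag is why each command downloads only one variant: it filters the repo's files with an fnmatch-style glob before downloading. A small illustration (the file list is hypothetical, standing in for the repo layout implied by the `-m` paths above):

```python
from fnmatch import fnmatch

# Hypothetical repo file list, standing in for the layout implied by the -m paths above
repo_files = [
    "deepseek-r1-distill-qwen-1.5B/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4/model.onnx",
    "deepseek-r1-distill-qwen-1.5B/cuda/cuda-int4-rtn-block-32/model.onnx",
    "deepseek-r1-distill-qwen-1.5B/directml/directml-int4-rtn-block-32/model.onnx",
]

# The pattern passed via --include in the CUDA command above
pattern = "deepseek-r1-distill-qwen-1.5B/cuda/*"
selected = [f for f in repo_files if fnmatch(f, pattern)]
print(selected)  # only the cuda/ files match
```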
For DirectML:

```bash
# Download the model directly using the Hugging Face CLI
huggingface-cli download onnxruntime/DeepSeek-R1-Distill-ONNX --include 'deepseek-r1-distill-qwen-1.5B/directml/*' --local-dir .

# Install the DirectML package of ONNX Runtime GenAI
pip install onnxruntime-genai-directml

# Please adjust the model directory (-m) accordingly
curl -O https://raw.githubusercontent.com/microsoft/onnxruntime-genai/refs/heads/main/examples/python/model-chat.py
python model-chat.py -m /path/to/directml-int4-rtn-block-32/ -e dml --chat_template "<|begin▁of▁sentence|><|User|>{input}<|Assistant|>"
```

## ONNX Models