## Performance
ONNX enables you to run your models on-device across CPU, GPU, and NPU. With ONNX, you can run your models on any machine, across silicon from all major vendors (Qualcomm, AMD, Intel, Nvidia, etc.).
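As a rough sketch of how this cross-device portability looks in practice, the snippet below shows one way to prefer a GPU execution provider in ONNX Runtime and fall back to CPU. The helper name `pick_providers` and the model path are illustrative assumptions, not part of this repo.

```python
# Sketch (assumes the `onnxruntime` package): order execution
# providers so that CUDA is used when available, with CPU fallback.

def pick_providers(available):
    """Return preferred providers, GPU (CUDA) first, then CPU."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]

# Typical usage (requires onnxruntime and a local model file):
# import onnxruntime as ort
# session = ort.InferenceSession(
#     "model.onnx",
#     providers=pick_providers(ort.get_available_providers()),
# )
```

On a machine without a CUDA-capable GPU, only `CPUExecutionProvider` is available, so the session silently falls back to CPU.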
The table below shows key benchmarks for the Windows GPU and CPU devices on which the ONNX models were tested.
| **Model** | **Precision** | **Device Type** | **Execution Provider** | **Device** | **Token Generation Throughput** | **Speed up vs base model** |
| :------------: | :------------: | :------------: | :------------: | :------------: | :------------: | :------------:|
| deepseek-ai_DeepSeek-R1-Distill-Qwen-1.5B | fp16 | GPU | CUDA | RTX 4090 | 197.195 | 4X |
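For reference, the throughput and speed-up figures in a table like this reduce to two simple ratios. The sketch below shows the arithmetic; the function names and the sample numbers are illustrative assumptions, not measurements from this repo.

```python
# Sketch: how "Token Generation Throughput" (tokens/second) and
# "Speed up vs base model" (a throughput ratio) are derived.

def throughput(num_tokens, elapsed_seconds):
    """Tokens generated per second."""
    return num_tokens / elapsed_seconds

def speedup(optimized_tps, baseline_tps):
    """Ratio of optimized-model throughput to base-model throughput."""
    return optimized_tps / baseline_tps

# Example: ~197.2 tok/s against a hypothetical ~49.3 tok/s baseline
# works out to roughly a 4x speed-up.
```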