## Performance
ONNX enables you to run your models on-device across CPU, GPU, and NPU. With ONNX, you can run your models on any machine, across silicon from all major vendors (Qualcomm, AMD, Intel, Nvidia, etc.).
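As a rough sketch of how this cross-device portability looks in practice, the snippet below shows one way to prefer a GPU execution provider in ONNX Runtime and fall back to CPU. The helper name `pick_providers` and the model path are illustrative assumptions, not part of this repo.

```python
# Sketch (assumes the `onnxruntime` package): order execution
# providers so that CUDA is used when available, with CPU fallback.

def pick_providers(available):
    """Return preferred providers, GPU (CUDA) first, then CPU."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]

# Typical usage (requires onnxruntime and a local model file):
# import onnxruntime as ort
# session = ort.InferenceSession(
#     "model.onnx",
#     providers=pick_providers(ort.get_available_providers()),
# )
```

On a machine without a CUDA-capable GPU, only `CPUExecutionProvider` is available, so the session silently falls back to CPU.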
The table below shows key benchmarks for the Windows GPU and CPU devices on which the ONNX models were tested.
| **Model** | **Precision** | **Device Type** | **Execution Provider** | **Device** | **Token Generation Throughput** | **Speed up vs base model** |
| :------------: | :------------: | :------------: | :------------: | :------------: | :------------: | :------------:|
| deepseek-ai_DeepSeek-R1-Distill-Qwen-1.5B | fp16 | GPU | CUDA | RTX 4090 | 197.195 | 4X |
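For reference, the throughput and speed-up figures in a table like this reduce to two simple ratios. The sketch below shows the arithmetic; the function names and the sample numbers are illustrative assumptions, not measurements from this repo.

```python
# Sketch: how "Token Generation Throughput" (tokens/second) and
# "Speed up vs base model" (a throughput ratio) are derived.

def throughput(num_tokens, elapsed_seconds):
    """Tokens generated per second."""
    return num_tokens / elapsed_seconds

def speedup(optimized_tps, baseline_tps):
    """Ratio of optimized-model throughput to base-model throughput."""
    return optimized_tps / baseline_tps

# Example: ~197.2 tok/s against a hypothetical ~49.3 tok/s baseline
# works out to roughly a 4x speed-up.
```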