qaihm-bot committed on
Commit ee4059d · verified · 1 Parent(s): ee51f7d

See https://github.com/qualcomm/ai-hub-models/releases/v0.53.0 for changelog.

Files changed (2)
  1. README.md +79 -61
  2. release_assets.json +5 -5
README.md CHANGED
@@ -27,9 +27,9 @@ Below are pre-exported model assets ready for deployment.
 
  | Runtime | Precision | Chipset | SDK Versions | Download |
  |---|---|---|---|---|
- | ONNX | float | Universal | QAIRT 2.42, ONNX Runtime 1.24.3 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/distil_whisper/releases/v0.52.0/distil_whisper-onnx-float.zip)
- | QNN_DLC | float | Universal | QAIRT 2.45 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/distil_whisper/releases/v0.52.0/distil_whisper-qnn_dlc-float.zip)
- | TFLITE | float | Universal | | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/distil_whisper/releases/v0.52.0/distil_whisper-tflite-float.zip)
+ | ONNX | float | Universal | QAIRT 2.42, ONNX Runtime 1.24.3 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/distil_whisper/releases/v0.53.0/distil_whisper-onnx-float.zip)
+ | QNN_DLC | float | Universal | QAIRT 2.45 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/distil_whisper/releases/v0.53.0/distil_whisper-qnn_dlc-float.zip)
+ | TFLITE | float | Universal | | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/distil_whisper/releases/v0.53.0/distil_whisper-tflite-float.zip)
 
  For more device-specific assets and performance metrics, visit **[Distil-Whisper on Qualcomm® AI Hub](https://aihub.qualcomm.com/models/distil_whisper)**.
 
@@ -61,64 +61,82 @@ See our repository for [Distil-Whisper on GitHub](https://github.com/qualcomm/ai
  ## Performance Summary
  | Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit
  |---|---|---|---|---|---|---
- | decoder | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 5.551 ms | 52 - 375 MB | NPU
- | decoder | ONNX | float | Snapdragon® X2 Elite | 5.091 ms | 178 - 178 MB | NPU
- | decoder | ONNX | float | Snapdragon® X Elite | 11.199 ms | 178 - 178 MB | NPU
- | decoder | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 8.561 ms | 52 - 425 MB | NPU
- | decoder | ONNX | float | Qualcomm® QCS8550 (Proxy) | 11.796 ms | 0 - 183 MB | NPU
- | decoder | ONNX | float | Qualcomm® QCS9075 | 13.139 ms | 40 - 82 MB | NPU
- | decoder | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 7.192 ms | 17 - 474 MB | NPU
- | decoder | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 5.704 ms | 1 - 533 MB | NPU
- | decoder | QNN_DLC | float | Snapdragon® X2 Elite | 5.695 ms | 40 - 40 MB | NPU
- | decoder | QNN_DLC | float | Snapdragon® X Elite | 10.955 ms | 40 - 40 MB | NPU
- | decoder | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 8.82 ms | 40 - 482 MB | NPU
- | decoder | QNN_DLC | float | Qualcomm® QCS8275 (Proxy) | 19.382 ms | 15 - 356 MB | NPU
- | decoder | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 11.66 ms | 40 - 42 MB | NPU
- | decoder | QNN_DLC | float | Qualcomm® SA8775P | 12.928 ms | 30 - 371 MB | NPU
- | decoder | QNN_DLC | float | Qualcomm® QCS9075 | 19.874 ms | 40 - 86 MB | NPU
- | decoder | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 17.985 ms | 40 - 339 MB | NPU
- | decoder | QNN_DLC | float | Qualcomm® SA7255P | 19.382 ms | 15 - 356 MB | NPU
- | decoder | QNN_DLC | float | Qualcomm® SA8295P | 13.861 ms | 22 - 271 MB | NPU
- | decoder | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 7.221 ms | 0 - 543 MB | NPU
- | decoder | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 5.821 ms | 4 - 570 MB | NPU
- | decoder | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 8.612 ms | 4 - 749 MB | NPU
- | decoder | TFLITE | float | Qualcomm® QCS8275 (Proxy) | 19.362 ms | 4 - 536 MB | NPU
- | decoder | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 11.539 ms | 5 - 7 MB | NPU
- | decoder | TFLITE | float | Qualcomm® SA8775P | 13.113 ms | 5 - 537 MB | NPU
- | decoder | TFLITE | float | Qualcomm® QCS9075 | 16.341 ms | 0 - 265 MB | NPU
- | decoder | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 18.213 ms | 5 - 467 MB | NPU
- | decoder | TFLITE | float | Qualcomm® SA7255P | 19.362 ms | 4 - 536 MB | NPU
- | decoder | TFLITE | float | Qualcomm® SA8295P | 14.198 ms | 5 - 296 MB | NPU
- | decoder | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 7.224 ms | 3 - 572 MB | NPU
- | encoder | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 49.631 ms | 79 - 838 MB | NPU
- | encoder | ONNX | float | Snapdragon® X2 Elite | 50.6 ms | 183 - 183 MB | NPU
- | encoder | ONNX | float | Snapdragon® X Elite | 123.678 ms | 182 - 182 MB | NPU
- | encoder | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 82.161 ms | 0 - 1151 MB | NPU
- | encoder | ONNX | float | Qualcomm® QCS8550 (Proxy) | 117.267 ms | 0 - 207 MB | NPU
- | encoder | ONNX | float | Qualcomm® QCS9075 | 151.048 ms | 80 - 83 MB | NPU
- | encoder | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 60.444 ms | 81 - 735 MB | NPU
- | encoder | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 57.722 ms | 1 - 712 MB | NPU
- | encoder | QNN_DLC | float | Snapdragon® X2 Elite | 59.958 ms | 1 - 1 MB | NPU
- | encoder | QNN_DLC | float | Snapdragon® X Elite | 139.329 ms | 1 - 1 MB | NPU
- | encoder | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 97.23 ms | 1 - 966 MB | NPU
- | encoder | QNN_DLC | float | Qualcomm® QCS8275 (Proxy) | 437.368 ms | 1 - 695 MB | NPU
- | encoder | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 135.623 ms | 0 - 7 MB | NPU
- | encoder | QNN_DLC | float | Qualcomm® SA8775P | 153.483 ms | 1 - 687 MB | NPU
- | encoder | QNN_DLC | float | Qualcomm® QCS9075 | 170.356 ms | 1 - 39 MB | NPU
- | encoder | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 271.874 ms | 1 - 825 MB | NPU
- | encoder | QNN_DLC | float | Qualcomm® SA7255P | 437.368 ms | 1 - 695 MB | NPU
- | encoder | QNN_DLC | float | Qualcomm® SA8295P | 192.977 ms | 1 - 610 MB | NPU
- | encoder | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 71.316 ms | 1 - 691 MB | NPU
- | encoder | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 415.94 ms | 42 - 84 MB | GPU
- | encoder | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 478.786 ms | 39 - 186 MB | GPU
- | encoder | TFLITE | float | Qualcomm® QCS8275 (Proxy) | 3138.884 ms | 32 - 76 MB | GPU
- | encoder | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 652.905 ms | 0 - 299 MB | GPU
- | encoder | TFLITE | float | Qualcomm® SA8775P | 1317.563 ms | 25 - 69 MB | GPU
- | encoder | TFLITE | float | Qualcomm® QCS9075 | 1270.38 ms | 0 - 40 MB | GPU
- | encoder | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 850.25 ms | 40 - 192 MB | GPU
- | encoder | TFLITE | float | Qualcomm® SA7255P | 3138.884 ms | 32 - 76 MB | GPU
- | encoder | TFLITE | float | Qualcomm® SA8295P | 668.036 ms | 40 - 83 MB | GPU
- | encoder | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 407.723 ms | 42 - 81 MB | GPU
+ | decoder | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 5.546 ms | 52 - 375 MB | NPU
+ | decoder | ONNX | float | Snapdragon® 8 Elite Mobile | 7.178 ms | 16 - 472 MB | NPU
+ | decoder | ONNX | float | Snapdragon® X2 Elite | 5.083 ms | 178 - 178 MB | NPU
+ | decoder | ONNX | float | Snapdragon® X Elite | 11.41 ms | 178 - 178 MB | NPU
+ | decoder | ONNX | float | Snapdragon® X Elite | 11.41 ms | 178 - 178 MB | NPU
+ | decoder | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 8.67 ms | 52 - 426 MB | NPU
+ | decoder | ONNX | float | Qualcomm® QCS8550 (Proxy) | 11.746 ms | 0 - 184 MB | NPU
+ | decoder | ONNX | float | Qualcomm® QCS9075 | 13.188 ms | 40 - 82 MB | NPU
+ | decoder | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 7.178 ms | 16 - 472 MB | NPU
+ | decoder | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 5.92 ms | 5 - 504 MB | NPU
+ | decoder | QNN_DLC | float | Snapdragon® 8 Elite Mobile | 7.295 ms | 5 - 548 MB | NPU
+ | decoder | QNN_DLC | float | Snapdragon® X2 Elite | 6.086 ms | 40 - 40 MB | NPU
+ | decoder | QNN_DLC | float | Snapdragon® X Elite | 10.901 ms | 40 - 40 MB | NPU
+ | decoder | QNN_DLC | float | Snapdragon® X Elite | 10.901 ms | 40 - 40 MB | NPU
+ | decoder | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 8.613 ms | 0 - 601 MB | NPU
+ | decoder | QNN_DLC | float | Qualcomm® QCS8275 (Proxy) | 19.248 ms | 28 - 525 MB | NPU
+ | decoder | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 11.477 ms | 40 - 43 MB | NPU
+ | decoder | QNN_DLC | float | Qualcomm® SA8775P | 12.828 ms | 30 - 522 MB | NPU
+ | decoder | QNN_DLC | float | Qualcomm® SA8775P | 12.828 ms | 30 - 522 MB | NPU
+ | decoder | QNN_DLC | float | Qualcomm® SA8775P | 12.828 ms | 30 - 522 MB | NPU
+ | decoder | QNN_DLC | float | Qualcomm® QCS9075 | 16.542 ms | 40 - 86 MB | NPU
+ | decoder | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 18.114 ms | 26 - 330 MB | NPU
+ | decoder | QNN_DLC | float | Qualcomm® SA7255P | 19.248 ms | 28 - 525 MB | NPU
+ | decoder | QNN_DLC | float | Qualcomm® SA8295P | 14.11 ms | 34 - 276 MB | NPU
+ | decoder | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 7.295 ms | 5 - 548 MB | NPU
+ | decoder | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 5.848 ms | 4 - 570 MB | NPU
+ | decoder | TFLITE | float | Snapdragon® 8 Elite Mobile | 7.215 ms | 4 - 572 MB | NPU
+ | decoder | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 8.619 ms | 4 - 749 MB | NPU
+ | decoder | TFLITE | float | Qualcomm® QCS8275 (Proxy) | 19.275 ms | 4 - 537 MB | NPU
+ | decoder | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 11.713 ms | 5 - 7 MB | NPU
+ | decoder | TFLITE | float | Qualcomm® SA8775P | 13.037 ms | 5 - 537 MB | NPU
+ | decoder | TFLITE | float | Qualcomm® SA8775P | 13.037 ms | 5 - 537 MB | NPU
+ | decoder | TFLITE | float | Qualcomm® SA8775P | 13.037 ms | 5 - 537 MB | NPU
+ | decoder | TFLITE | float | Qualcomm® QCS9075 | 16.188 ms | 0 - 265 MB | NPU
+ | decoder | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 18.522 ms | 5 - 466 MB | NPU
+ | decoder | TFLITE | float | Qualcomm® SA7255P | 19.275 ms | 4 - 537 MB | NPU
+ | decoder | TFLITE | float | Qualcomm® SA8295P | 14.231 ms | 5 - 297 MB | NPU
+ | decoder | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 7.215 ms | 4 - 572 MB | NPU
+ | encoder | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 49.731 ms | 80 - 839 MB | NPU
+ | encoder | ONNX | float | Snapdragon® 8 Elite Mobile | 60.575 ms | 80 - 734 MB | NPU
+ | encoder | ONNX | float | Snapdragon® X2 Elite | 50.437 ms | 183 - 183 MB | NPU
+ | encoder | ONNX | float | Snapdragon® X Elite | 123.445 ms | 182 - 182 MB | NPU
+ | encoder | ONNX | float | Snapdragon® X Elite | 123.445 ms | 182 - 182 MB | NPU
+ | encoder | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 82.214 ms | 80 - 1239 MB | NPU
+ | encoder | ONNX | float | Qualcomm® QCS8550 (Proxy) | 118.816 ms | 0 - 202 MB | NPU
+ | encoder | ONNX | float | Qualcomm® QCS9075 | 150.604 ms | 79 - 83 MB | NPU
+ | encoder | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 60.575 ms | 80 - 734 MB | NPU
+ | encoder | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 58.551 ms | 1 - 712 MB | NPU
+ | encoder | QNN_DLC | float | Snapdragon® 8 Elite Mobile | 71.602 ms | 1 - 692 MB | NPU
+ | encoder | QNN_DLC | float | Snapdragon® X2 Elite | 60.24 ms | 1 - 1 MB | NPU
+ | encoder | QNN_DLC | float | Snapdragon® X Elite | 139.101 ms | 1 - 1 MB | NPU
+ | encoder | QNN_DLC | float | Snapdragon® X Elite | 139.101 ms | 1 - 1 MB | NPU
+ | encoder | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 97.219 ms | 0 - 962 MB | NPU
+ | encoder | QNN_DLC | float | Qualcomm® QCS8275 (Proxy) | 437.836 ms | 1 - 696 MB | NPU
+ | encoder | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 135.253 ms | 0 - 7 MB | NPU
+ | encoder | QNN_DLC | float | Qualcomm® SA8775P | 153.502 ms | 1 - 687 MB | NPU
+ | encoder | QNN_DLC | float | Qualcomm® SA8775P | 153.502 ms | 1 - 687 MB | NPU
+ | encoder | QNN_DLC | float | Qualcomm® SA8775P | 153.502 ms | 1 - 687 MB | NPU
+ | encoder | QNN_DLC | float | Qualcomm® QCS9075 | 170.214 ms | 1 - 39 MB | NPU
+ | encoder | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 269.206 ms | 0 - 823 MB | NPU
+ | encoder | QNN_DLC | float | Qualcomm® SA7255P | 437.836 ms | 1 - 696 MB | NPU
+ | encoder | QNN_DLC | float | Qualcomm® SA8295P | 192.976 ms | 1 - 611 MB | NPU
+ | encoder | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 71.602 ms | 1 - 692 MB | NPU
+ | encoder | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 403.997 ms | 42 - 85 MB | GPU
+ | encoder | TFLITE | float | Snapdragon® 8 Elite Mobile | 409.903 ms | 40 - 80 MB | GPU
+ | encoder | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 475.908 ms | 42 - 184 MB | GPU
+ | encoder | TFLITE | float | Qualcomm® QCS8275 (Proxy) | 3135.306 ms | 24 - 69 MB | GPU
+ | encoder | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 657.568 ms | 0 - 318 MB | GPU
+ | encoder | TFLITE | float | Qualcomm® SA8775P | 1316.653 ms | 20 - 64 MB | GPU
+ | encoder | TFLITE | float | Qualcomm® SA8775P | 1316.653 ms | 20 - 64 MB | GPU
+ | encoder | TFLITE | float | Qualcomm® SA8775P | 1316.653 ms | 20 - 64 MB | GPU
+ | encoder | TFLITE | float | Qualcomm® QCS9075 | 1271.896 ms | 0 - 40 MB | GPU
+ | encoder | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 852.126 ms | 39 - 193 MB | GPU
+ | encoder | TFLITE | float | Qualcomm® SA7255P | 3135.306 ms | 24 - 69 MB | GPU
+ | encoder | TFLITE | float | Qualcomm® SA8295P | 671.062 ms | 38 - 81 MB | GPU
+ | encoder | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 409.903 ms | 40 - 80 MB | GPU
 
  ## License
  * The license for the original implementation of Distil-Whisper can be found
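To gauge how far a benchmark number moved between v0.52.0 and v0.53.0, a Performance Summary row can be parsed and compared with its counterpart. The following is a minimal sketch and not part of the repository: `parse_row` and `percent_change` are hypothetical helper names, and the two sample rows are copied from the README diff above.

```python
def parse_row(row: str) -> dict:
    """Parse one Performance Summary row, with or without a leading +/- diff marker."""
    # Drop the diff marker, surrounding whitespace, and the leading table pipe.
    body = row.strip().lstrip("+-").strip().strip("|")
    cells = [cell.strip() for cell in body.split("|")]
    # The table has exactly seven columns; unpacking fails loudly otherwise.
    model, runtime, precision, chipset, time_ms, memory, unit = cells
    return {
        "model": model,
        "runtime": runtime,
        "precision": precision,
        "chipset": chipset,
        "inference_ms": float(time_ms.split()[0]),  # "5.546 ms" -> 5.546
        "memory": memory,
        "compute_unit": unit,
    }


def percent_change(old_ms: float, new_ms: float) -> float:
    """Relative change in inference time; negative means the new release is faster."""
    return (new_ms - old_ms) / old_ms * 100.0


# Sample rows taken verbatim from the diff (old release, then new release).
OLD_ROW = "- | decoder | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 5.551 ms | 52 - 375 MB | NPU"
NEW_ROW = "+ | decoder | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 5.546 ms | 52 - 375 MB | NPU"

old = parse_row(OLD_ROW)
new = parse_row(NEW_ROW)
delta = percent_change(old["inference_ms"], new["inference_ms"])
print(f"{new['model']} / {new['runtime']} / {new['chipset']}: {delta:+.2f}%")
```

Matching rows on the (model, runtime, chipset) triple would extend this to the whole table, though the new release lists some chipsets more than once, so duplicates would need to be collapsed first.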
release_assets.json CHANGED
@@ -1,26 +1,26 @@
  {
-   "version": "0.52.0",
+   "version": "0.53.0",
    "precisions": {
      "float": {
        "universal_assets": {
          "tflite": {
            "tool_versions": {
-             "litert": "1.4.2"
+             "litert": "1.4.3"
            },
-           "download_url": "https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/distil_whisper/releases/v0.52.0/distil_whisper-tflite-float.zip"
+           "download_url": "https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/distil_whisper/releases/v0.53.0/distil_whisper-tflite-float.zip"
          },
          "qnn_dlc": {
            "tool_versions": {
              "qairt": "2.45.0.260326154327"
            },
-           "download_url": "https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/distil_whisper/releases/v0.52.0/distil_whisper-qnn_dlc-float.zip"
+           "download_url": "https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/distil_whisper/releases/v0.53.0/distil_whisper-qnn_dlc-float.zip"
          },
          "onnx": {
            "tool_versions": {
              "qairt": "2.42.0.251225135753_193295",
              "onnx_runtime": "1.24.3"
            },
-           "download_url": "https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/distil_whisper/releases/v0.52.0/distil_whisper-onnx-float.zip"
+           "download_url": "https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/distil_whisper/releases/v0.53.0/distil_whisper-onnx-float.zip"
          }
        }
      }
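For readers who want to consume `release_assets.json` programmatically, the nesting visible in this hunk (`precisions` → precision name → `universal_assets` → runtime → `tool_versions` / `download_url`) can be walked as sketched below. This is an illustrative sketch only: `iter_universal_assets` is a hypothetical helper, the embedded sample reproduces just two of the three runtimes, and the closing braces of the file are assumed because the hunk shows only its first 26 lines.

```python
import json

# Trimmed sample mirroring the v0.53.0 side of the diff; the final closing
# braces are an assumption, since the hunk stops at line 26.
SAMPLE_MANIFEST = """
{
  "version": "0.53.0",
  "precisions": {
    "float": {
      "universal_assets": {
        "tflite": {
          "tool_versions": {"litert": "1.4.3"},
          "download_url": "https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/distil_whisper/releases/v0.53.0/distil_whisper-tflite-float.zip"
        },
        "qnn_dlc": {
          "tool_versions": {"qairt": "2.45.0.260326154327"},
          "download_url": "https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/distil_whisper/releases/v0.53.0/distil_whisper-qnn_dlc-float.zip"
        }
      }
    }
  }
}
"""


def iter_universal_assets(manifest: dict):
    """Yield (precision, runtime, tool_versions, download_url) for each universal asset."""
    for precision, entry in manifest.get("precisions", {}).items():
        for runtime, asset in entry.get("universal_assets", {}).items():
            yield precision, runtime, asset.get("tool_versions", {}), asset["download_url"]


manifest = json.loads(SAMPLE_MANIFEST)
assets = list(iter_universal_assets(manifest))

for precision, runtime, tools, url in assets:
    print(f"{manifest['version']} {precision}/{runtime} tools={tools} -> {url}")
```

Checking that every `download_url` contains the manifest's own `version` string is a cheap consistency test for a release bump like this one.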