qaihm-bot committed on
Commit 871e7da · verified · 1 Parent(s): 63368a7

See https://github.com/quic/ai-hub-models/releases/v0.47.0 for changelog.

Files changed (1)
  1. README.md +58 -63
README.md CHANGED
@@ -28,27 +28,22 @@ Below are pre-exported model assets ready for deployment.
 
  | Runtime | Precision | Chipset | SDK Versions | Download |
  |---|---|---|---|---|
- | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X Elite | QAIRT 2.37, ONNX Runtime 1.23.0 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_x_elite.zip)
- | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | QAIRT 2.37, ONNX Runtime 1.23.0 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_8gen3.zip)
- | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS6490 | QAIRT 2.37, ONNX Runtime 1.23.0 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_qcs6490.zip)
- | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | QAIRT 2.37, ONNX Runtime 1.23.0 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_qcs8550_proxy.zip)
- | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | QAIRT 2.37, ONNX Runtime 1.23.0 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_8_elite_for_galaxy.zip)
- | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | QAIRT 2.37, ONNX Runtime 1.23.0 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_7gen4.zip)
- | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | QAIRT 2.37, ONNX Runtime 1.23.0 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_8_elite_gen5.zip)
- | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCM6690 | QAIRT 2.37, ONNX Runtime 1.23.0 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_qcm6690.zip)
- | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS9075 | QAIRT 2.37, ONNX Runtime 1.23.0 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_qcs9075.zip)
- | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X Elite | QAIRT 2.42 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_x_elite.zip)
- | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Gen 3 Mobile | QAIRT 2.42 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_8gen3.zip)
- | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8275 (Proxy) | QAIRT 2.42 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_qcs8275_proxy.zip)
- | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8550 (Proxy) | QAIRT 2.42 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_qcs8550_proxy.zip)
- | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA8775P | QAIRT 2.42 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_sa8775p.zip)
- | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | QAIRT 2.42 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_8_elite_for_galaxy.zip)
- | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 7 Gen 4 Mobile | QAIRT 2.42 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_7gen4.zip)
- | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | QAIRT 2.42 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_8_elite_gen5.zip)
- | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA7255P | QAIRT 2.42 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_sa7255p.zip)
- | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA8295P | QAIRT 2.42 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_sa8295p.zip)
- | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCM6690 | QAIRT 2.42 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_qcm6690.zip)
- | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS9075 | QAIRT 2.42 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.46.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_qcs9075.zip)
 
  For more device-specific assets and performance metrics, visit **[Whisper-Small-Quantized on Qualcomm® AI Hub](https://aihub.qualcomm.com/models/whisper_small_quantized)**.
 
@@ -76,48 +71,48 @@ See our repository for [Whisper-Small-Quantized on GitHub](https://github.com/qu
  ## Performance Summary
  | Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit
  |---|---|---|---|---|---|---
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X Elite | 7.808 ms | 186 - 186 MB | NPU
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | 6.786 ms | 38 - 46 MB | NPU
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS6490 | 32.683 ms | 28 - 61 MB | NPU
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | 8.805 ms | 28 - 29 MB | NPU
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS9075 | 9.596 ms | 24 - 57 MB | NPU
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCM6690 | 35.89 ms | 29 - 36 MB | NPU
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 5.133 ms | 25 - 37 MB | NPU
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | 12.025 ms | 29 - 36 MB | NPU
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 4.337 ms | 30 - 41 MB | NPU
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X Elite | 7.476 ms | 30 - 30 MB | NPU
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Gen 3 Mobile | 6.327 ms | 28 - 36 MB | NPU
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8275 (Proxy) | 13.427 ms | 27 - 36 MB | NPU
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8550 (Proxy) | 8.22 ms | 31 - 32 MB | NPU
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA8775P | 9.312 ms | 20 - 29 MB | NPU
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS9075 | 8.909 ms | 25 - 60 MB | NPU
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCM6690 | 30.208 ms | 29 - 36 MB | NPU
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA7255P | 13.427 ms | 27 - 36 MB | NPU
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA8295P | 10.107 ms | 24 - 30 MB | NPU
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 4.738 ms | 18 - 30 MB | NPU
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 7 Gen 4 Mobile | 10.84 ms | 29 - 36 MB | NPU
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 3.967 ms | 30 - 40 MB | NPU
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X Elite | 61.6 ms | 107 - 107 MB | NPU
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | 45.023 ms | 65 - 73 MB | NPU
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS6490 | 538.284 ms | 35 - 38 MB | NPU
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | 62.149 ms | 0 - 113 MB | NPU
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS9075 | 60.93 ms | 63 - 66 MB | NPU
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCM6690 | 1544.251 ms | 29 - 40 MB | NPU
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 36.68 ms | 63 - 71 MB | NPU
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | 187.439 ms | 63 - 73 MB | NPU
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 27.853 ms | 63 - 74 MB | NPU
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X Elite | 263.986 ms | 0 - 0 MB | NPU
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Gen 3 Mobile | 247.921 ms | 1 - 8 MB | NPU
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8275 (Proxy) | 466.54 ms | 1 - 8 MB | NPU
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8550 (Proxy) | 349.909 ms | 1 - 3 MB | NPU
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA8775P | 273.662 ms | 0 - 9 MB | NPU
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS9075 | 254.394 ms | 0 - 29 MB | NPU
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCM6690 | 4354.892 ms | 0 - 7 MB | NPU
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA7255P | 466.54 ms | 1 - 8 MB | NPU
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA8295P | 1357.741 ms | 26 - 38 MB | NPU
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 199.787 ms | 1 - 10 MB | NPU
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 7 Gen 4 Mobile | 473.501 ms | 0 - 7 MB | NPU
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 175.841 ms | 1 - 10 MB | NPU
 
  ## License
  * The license for the original implementation of Whisper-Small-Quantized can be found
 
 
  | Runtime | Precision | Chipset | SDK Versions | Download |
  |---|---|---|---|---|
+ | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X Elite | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.47.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_x_elite.zip)
+ | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.47.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_8gen3.zip)
+ | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.47.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_qcs8550_proxy.zip)
+ | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X2 Elite | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.47.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_x2_elite.zip)
+ | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCM6690 | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.47.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_qcm6690.zip)
+ | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS9075 | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.47.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_qcs9075.zip)
+ | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.47.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_8_elite_for_galaxy.zip)
+ | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.47.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_7gen4.zip)
+ | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.47.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_8_elite_gen5.zip)
+ | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8275 (Proxy) | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.47.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_qcs8275_proxy.zip)
+ | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8550 (Proxy) | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.47.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_qcs8550_proxy.zip)
+ | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA8775P | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.47.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_sa8775p.zip)
+ | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X2 Elite | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.47.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_x2_elite.zip)
+ | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA7255P | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.47.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_sa7255p.zip)
+ | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCM6690 | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.47.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_qcm6690.zip)
+ | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS9075 | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.47.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_qcs9075.zip)
 
  For more device-specific assets and performance metrics, visit **[Whisper-Small-Quantized on Qualcomm® AI Hub](https://aihub.qualcomm.com/models/whisper_small_quantized)**.
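
The download links in the table above appear to share one URL layout: a fixed S3 prefix, the model id, the release tag, and a file name built from the model id, the lowercased runtime, the precision, and a chipset slug. The following is a minimal sketch of that pattern as inferred from the rows in this diff, not part of the README or of any Qualcomm tooling. The function name `asset_url` and its parameter names are hypothetical, and the chipset slug (for example `qualcomm_qcs9075`) has to be copied from an existing link because no mapping from display name to slug is given here.

```python
# Sketch only: the URL layout is inferred from the table rows, not from a documented API.
BASE_URL = (
    "https://qaihub-public-assets.s3.us-west-2.amazonaws.com"
    "/qai-hub-models/models"
)


def asset_url(model_id: str, release: str, runtime: str,
              precision: str, chipset_slug: str) -> str:
    """Build a download URL that follows the pattern seen in the table."""
    # The file name repeats the model id and uses the runtime in lowercase.
    filename = f"{model_id}-{runtime.lower()}-{precision}-{chipset_slug}.zip"
    return f"{BASE_URL}/{model_id}/releases/{release}/{filename}"


if __name__ == "__main__":
    print(asset_url(
        "whisper_small_quantized", "v0.47.0",
        "QNN_CONTEXT_BINARY", "w8a16", "qualcomm_qcs9075",
    ))
```

Under this assumption, the only part of each link that changes between releases is the release tag (`v0.46.0` to `v0.47.0` in this commit); whether an asset actually exists for a given combination still has to be checked against the table.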
 
 
  ## Performance Summary
  | Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit
  |---|---|---|---|---|---|---
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X Elite | 7.991 ms | 185 - 185 MB | NPU
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | 6.451 ms | 38 - 50 MB | NPU
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | 8.335 ms | 28 - 29 MB | NPU
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS9075 | 9.124 ms | 24 - 57 MB | NPU
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCM6690 | 35.618 ms | 28 - 38 MB | NPU
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 4.767 ms | 25 - 37 MB | NPU
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | 10.888 ms | 28 - 35 MB | NPU
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 3.998 ms | 30 - 40 MB | NPU
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X2 Elite | 3.718 ms | 186 - 186 MB | NPU
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X Elite | 7.662 ms | 30 - 30 MB | NPU
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Gen 3 Mobile | 6.366 ms | 30 - 39 MB | NPU
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8275 (Proxy) | 13.482 ms | 29 - 37 MB | NPU
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8550 (Proxy) | 8.23 ms | 30 - 33 MB | NPU
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA8775P | 9.233 ms | 19 - 27 MB | NPU
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS9075 | 8.915 ms | 25 - 60 MB | NPU
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCM6690 | 31.639 ms | 30 - 37 MB | NPU
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA7255P | 13.482 ms | 29 - 37 MB | NPU
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 4.74 ms | 21 - 35 MB | NPU
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 7 Gen 4 Mobile | 10.815 ms | 30 - 37 MB | NPU
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 3.972 ms | 30 - 40 MB | NPU
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X2 Elite | 4.089 ms | 30 - 30 MB | NPU
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X Elite | 264.922 ms | 127 - 127 MB | NPU
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | 240.754 ms | 64 - 76 MB | NPU
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | 334.502 ms | 0 - 129 MB | NPU
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS9075 | 254.756 ms | 63 - 66 MB | NPU
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCM6690 | 4091.143 ms | 2 - 12 MB | NPU
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 198.385 ms | 63 - 73 MB | NPU
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | 457.801 ms | 56 - 63 MB | NPU
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 175.574 ms | 63 - 73 MB | NPU
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X2 Elite | 154.538 ms | 127 - 127 MB | NPU
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X Elite | 294.934 ms | 0 - 0 MB | NPU
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Gen 3 Mobile | 268.023 ms | 1 - 8 MB | NPU
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8275 (Proxy) | 517.1 ms | 1 - 9 MB | NPU
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8550 (Proxy) | 358.638 ms | 1 - 2 MB | NPU
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA8775P | 310.466 ms | 0 - 8 MB | NPU
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS9075 | 291.599 ms | 0 - 29 MB | NPU
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCM6690 | 4113.051 ms | 0 - 6 MB | NPU
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA7255P | 517.1 ms | 1 - 9 MB | NPU
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 224.015 ms | 1 - 10 MB | NPU
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 7 Gen 4 Mobile | 467.841 ms | 1 - 7 MB | NPU
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 187.818 ms | 1 - 10 MB | NPU
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X2 Elite | 153.916 ms | 0 - 0 MB | NPU
 
  ## License
  * The license for the original implementation of Whisper-Small-Quantized can be found