qaihm-bot committed on
Commit 227c7c8 · verified · 1 Parent(s): 7a9eaac

See https://github.com/qualcomm/ai-hub-models/releases/v0.49.1 for changelog.

Files changed (1)
  1. README.md +62 -62
README.md CHANGED
@@ -28,26 +28,26 @@ Below are pre-exported model assets ready for deployment.
28
 
29
  | Runtime | Precision | Chipset | SDK Versions | Download |
30
  |---|---|---|---|---|
31
- | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X2 Elite | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_x2_elite.zip)
32
- | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X Elite | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_x_elite.zip)
33
- | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_8gen3.zip)
34
- | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_qcs8550_proxy.zip)
35
- | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_8_elite_for_galaxy.zip)
36
- | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_7gen4.zip)
37
- | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_8_elite_gen5.zip)
38
- | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCM6690 | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_qcm6690.zip)
39
- | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS9075 | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_qcs9075.zip)
40
- | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X2 Elite | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_x2_elite.zip)
41
- | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X Elite | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_x_elite.zip)
42
- | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Gen 3 Mobile | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_8gen3.zip)
43
- | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8550 (Proxy) | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_qcs8550_proxy.zip)
44
- | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA8775P | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_sa8775p.zip)
45
- | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_8_elite_for_galaxy.zip)
46
- | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 7 Gen 4 Mobile | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_7gen4.zip)
47
- | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_8_elite_gen5.zip)
48
- | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA7255P | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_sa7255p.zip)
49
- | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCM6690 | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_qcm6690.zip)
50
- | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS9075 | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.48.0/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_qcs9075.zip)
51
 
52
  For more device-specific assets and performance metrics, visit **[Whisper-Small-Quantized on Qualcomm® AI Hub](https://aihub.qualcomm.com/models/whisper_small_quantized)**.
53
 
@@ -75,48 +75,48 @@ See our repository for [Whisper-Small-Quantized on GitHub](https://github.com/qu
75
  ## Performance Summary
76
  | Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit
77
  |---|---|---|---|---|---|---
78
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X2 Elite | 3.84 ms | 185 - 185 MB | NPU
79
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X Elite | 8.162 ms | 185 - 185 MB | NPU
80
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | 6.332 ms | 38 - 45 MB | NPU
81
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | 8.365 ms | 29 - 30 MB | NPU
82
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS9075 | 9.108 ms | 24 - 57 MB | NPU
83
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCM6690 | 30.98 ms | 29 - 38 MB | NPU
84
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 4.78 ms | 25 - 37 MB | NPU
85
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | 10.969 ms | 29 - 35 MB | NPU
86
- | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 4.004 ms | 30 - 40 MB | NPU
87
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X2 Elite | 4.233 ms | 30 - 30 MB | NPU
88
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X Elite | 7.599 ms | 30 - 30 MB | NPU
89
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Gen 3 Mobile | 6.43 ms | 30 - 38 MB | NPU
90
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8275 (Proxy) | 13.592 ms | 29 - 37 MB | NPU
91
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8550 (Proxy) | 8.259 ms | 30 - 32 MB | NPU
92
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA8775P | 9.279 ms | 30 - 40 MB | NPU
93
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS9075 | 8.925 ms | 25 - 60 MB | NPU
94
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCM6690 | 30.618 ms | 29 - 36 MB | NPU
95
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA7255P | 13.592 ms | 29 - 37 MB | NPU
96
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 4.847 ms | 8 - 17 MB | NPU
97
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 7 Gen 4 Mobile | 10.88 ms | 30 - 37 MB | NPU
98
- | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 3.987 ms | 30 - 41 MB | NPU
99
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X2 Elite | 154.62 ms | 127 - 127 MB | NPU
100
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X Elite | 263.634 ms | 127 - 127 MB | NPU
101
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | 244.786 ms | 56 - 62 MB | NPU
102
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | 342.463 ms | 0 - 130 MB | NPU
103
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS9075 | 254.265 ms | 63 - 67 MB | NPU
104
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCM6690 | 4224.409 ms | 2 - 12 MB | NPU
105
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 202.351 ms | 63 - 76 MB | NPU
106
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | 460.872 ms | 54 - 65 MB | NPU
107
- | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 176.227 ms | 63 - 73 MB | NPU
108
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X2 Elite | 155.158 ms | 0 - 0 MB | NPU
109
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X Elite | 296.792 ms | 0 - 0 MB | NPU
110
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Gen 3 Mobile | 269.514 ms | 1 - 8 MB | NPU
111
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8275 (Proxy) | 516.472 ms | 1 - 9 MB | NPU
112
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8550 (Proxy) | 366.644 ms | 1 - 2 MB | NPU
113
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA8775P | 311.056 ms | 0 - 9 MB | NPU
114
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS9075 | 292.146 ms | 0 - 29 MB | NPU
115
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCM6690 | 4164.003 ms | 0 - 7 MB | NPU
116
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA7255P | 516.472 ms | 1 - 9 MB | NPU
117
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 224.338 ms | 1 - 10 MB | NPU
118
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 7 Gen 4 Mobile | 474.795 ms | 1 - 7 MB | NPU
119
- | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 174.107 ms | 1 - 10 MB | NPU
120
 
121
  ## License
122
  * The license for the original implementation of Whisper-Small-Quantized can be found
 
28
 
29
  | Runtime | Precision | Chipset | SDK Versions | Download |
30
  |---|---|---|---|---|
31
+ | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_8_elite_gen5.zip)
32
+ | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X2 Elite | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_x2_elite.zip)
33
+ | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X Elite | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_x_elite.zip)
34
+ | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_8gen3.zip)
35
+ | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_qcs8550_proxy.zip)
36
+ | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_8_elite_for_galaxy.zip)
37
+ | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_snapdragon_7gen4.zip)
38
+ | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCM6690 | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_qcm6690.zip)
39
+ | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS9075 | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-precompiled_qnn_onnx-w8a16-qualcomm_qcs9075.zip)
40
+ | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_8_elite_gen5.zip)
41
+ | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X2 Elite | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_x2_elite.zip)
42
+ | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X Elite | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_x_elite.zip)
43
+ | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Gen 3 Mobile | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_8gen3.zip)
44
+ | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8550 (Proxy) | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_qcs8550_proxy.zip)
45
+ | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA8775P | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_sa8775p.zip)
46
+ | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_8_elite_for_galaxy.zip)
47
+ | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 7 Gen 4 Mobile | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_snapdragon_7gen4.zip)
48
+ | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA7255P | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_sa7255p.zip)
49
+ | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCM6690 | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_qcm6690.zip)
50
+ | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS9075 | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_small_quantized/releases/v0.49.1/whisper_small_quantized-qnn_context_binary-w8a16-qualcomm_qcs9075.zip)
51
 
52
  For more device-specific assets and performance metrics, visit **[Whisper-Small-Quantized on Qualcomm® AI Hub](https://aihub.qualcomm.com/models/whisper_small_quantized)**.
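
The download links in the table above appear to follow a single pattern: base bucket, model name, release version, then a file name built from the lowercased runtime, the precision, and a chipset slug. The sketch below reconstructs that pattern under this assumption. The helper name `asset_url` and its parameter names are hypothetical and not part of any Qualcomm tooling, and the chipset slugs (for example `qualcomm_snapdragon_x2_elite`) are copied from the links rather than derived from the display names.

```python
# Base path copied from the download links in the table.
BASE_URL = (
    "https://qaihub-public-assets.s3.us-west-2.amazonaws.com"
    "/qai-hub-models/models"
)


def asset_url(model: str, version: str, runtime: str,
              precision: str, chipset_slug: str) -> str:
    """Build a download URL following the pattern seen in the table rows.

    Hypothetical helper: the pattern is inferred from the links in this
    diff and may not hold for other models or releases.
    """
    file_name = f"{model}-{runtime.lower()}-{precision}-{chipset_slug}.zip"
    return f"{BASE_URL}/{model}/releases/{version}/{file_name}"


if __name__ == "__main__":
    print(asset_url(
        "whisper_small_quantized",
        "v0.49.1",
        "PRECOMPILED_QNN_ONNX",
        "w8a16",
        "qualcomm_snapdragon_x2_elite",
    ))
```

Under this reading, the commit changes only the `version` segment of each link, from `v0.48.0` to `v0.49.1`, and reorders the rows.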
53
 
 
75
  ## Performance Summary
76
  | Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit
77
  |---|---|---|---|---|---|---
78
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 4.017 ms | 36 - 46 MB | NPU
79
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X2 Elite | 3.825 ms | 185 - 185 MB | NPU
80
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X Elite | 7.953 ms | 185 - 185 MB | NPU
81
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | 6.47 ms | 39 - 48 MB | NPU
82
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | 8.304 ms | 27 - 29 MB | NPU
83
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS9075 | 9.145 ms | 24 - 57 MB | NPU
84
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCM6690 | 32.401 ms | 28 - 37 MB | NPU
85
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 4.776 ms | 25 - 37 MB | NPU
86
+ | WhisperSmallDecoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | 10.88 ms | 30 - 36 MB | NPU
87
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 3.967 ms | 30 - 41 MB | NPU
88
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X2 Elite | 4.249 ms | 30 - 30 MB | NPU
89
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X Elite | 7.557 ms | 30 - 30 MB | NPU
90
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Gen 3 Mobile | 6.203 ms | 19 - 27 MB | NPU
91
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8275 (Proxy) | 13.659 ms | 12 - 19 MB | NPU
92
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8550 (Proxy) | 8.187 ms | 30 - 32 MB | NPU
93
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA8775P | 27.56 ms | 19 - 26 MB | NPU
94
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS9075 | 8.936 ms | 25 - 60 MB | NPU
95
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCM6690 | 30.781 ms | 30 - 37 MB | NPU
96
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA7255P | 13.659 ms | 12 - 19 MB | NPU
97
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 4.793 ms | 28 - 41 MB | NPU
98
+ | WhisperSmallDecoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 7 Gen 4 Mobile | 10.907 ms | 29 - 36 MB | NPU
99
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 174.548 ms | 63 - 73 MB | NPU
100
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X2 Elite | 155.383 ms | 127 - 127 MB | NPU
101
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® X Elite | 267.039 ms | 127 - 127 MB | NPU
102
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | 240.522 ms | 63 - 74 MB | NPU
103
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | 386.76 ms | 55 - 58 MB | NPU
104
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCS9075 | 255.083 ms | 63 - 66 MB | NPU
105
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Qualcomm® QCM6690 | 4338.57 ms | 1 - 12 MB | NPU
106
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 200.696 ms | 64 - 75 MB | NPU
107
+ | WhisperSmallEncoderQuantizable | PRECOMPILED_QNN_ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | 457.646 ms | 56 - 62 MB | NPU
108
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 173.204 ms | 1 - 11 MB | NPU
109
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X2 Elite | 154.118 ms | 0 - 0 MB | NPU
110
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® X Elite | 295.657 ms | 0 - 0 MB | NPU
111
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Gen 3 Mobile | 267.017 ms | 3 - 10 MB | NPU
112
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8275 (Proxy) | 515.755 ms | 1 - 9 MB | NPU
113
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS8550 (Proxy) | 366.478 ms | 1 - 2 MB | NPU
114
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA8775P | 310.162 ms | 0 - 9 MB | NPU
115
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCS9075 | 292.347 ms | 0 - 29 MB | NPU
116
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® QCM6690 | 4026.592 ms | 0 - 7 MB | NPU
117
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Qualcomm® SA7255P | 515.755 ms | 1 - 9 MB | NPU
118
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 223.174 ms | 1 - 10 MB | NPU
119
+ | WhisperSmallEncoderQuantizable | QNN_CONTEXT_BINARY | w8a16 | Snapdragon® 7 Gen 4 Mobile | 471.739 ms | 1 - 7 MB | NPU
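
Most inference times in the performance table move by only a few percent between the removed (v0.48.0) and added (v0.49.1) rows, so the larger shifts are easy to miss. The sketch below shows one way to flag them. The timings are copied from four rows of this diff, while the function names and the 10% threshold are illustrative assumptions, not part of the release tooling.

```python
# (model, runtime, chipset) -> (old ms, new ms), copied from rows in this diff.
TIMINGS = {
    ("Decoder", "QNN_CONTEXT_BINARY", "SA8775P"): (9.279, 27.56),
    ("Decoder", "QNN_CONTEXT_BINARY", "Snapdragon 8 Gen 3 Mobile"): (6.43, 6.203),
    ("Encoder", "PRECOMPILED_QNN_ONNX", "QCS8550 (Proxy)"): (342.463, 386.76),
    ("Encoder", "QNN_CONTEXT_BINARY", "QCM6690"): (4164.003, 4026.592),
}


def percent_change(old_ms: float, new_ms: float) -> float:
    """Relative change in percent; positive means the new run is slower."""
    return (new_ms - old_ms) / old_ms * 100.0


def flag_changes(timings: dict, threshold_pct: float = 10.0) -> dict:
    """Return rows whose absolute change exceeds threshold_pct (assumed cutoff)."""
    flagged = {}
    for key, (old_ms, new_ms) in timings.items():
        change = percent_change(old_ms, new_ms)
        if abs(change) > threshold_pct:
            flagged[key] = change
    return flagged


if __name__ == "__main__":
    for key, change in flag_changes(TIMINGS).items():
        print(f"{' / '.join(key)}: {change:+.1f}%")
```

With these four rows, the SA8775P decoder entry (9.279 ms to 27.56 ms) and the QCS8550 (Proxy) encoder entry (342.463 ms to 386.76 ms) exceed the threshold. The diff does not say whether such shifts reflect a real regression or run-to-run measurement variance.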
120
 
121
  ## License
122
  * The license for the original implementation of Whisper-Small-Quantized can be found