qaihm-bot committed
Commit 9201d4a · verified · 1 Parent(s): 3bd96da

See https://github.com/qualcomm/ai-hub-models/releases/v0.48.0 for changelog.

Files changed (1)
  1. README.md +59 -60
README.md CHANGED
@@ -15,7 +15,7 @@ pipeline_tag: automatic-speech-recognition
15
  HuggingFace Whisper-Small ASR (Automatic Speech Recognition) model is a state-of-the-art system designed for transcribing spoken language into written text. This model is based on the transformer architecture and has been optimized for edge inference by replacing Multi-Head Attention (MHA) with Single-Head Attention (SHA) and linear layers with convolutional (conv) layers. It exhibits robust performance in realistic, noisy environments, making it highly reliable for real-world applications. Specifically, it excels in long-form transcription, capable of accurately transcribing audio clips up to 30 seconds long. Time to the first token is the encoder's latency, while time to each additional token is decoder's latency, where we assume a max decoded length specified below.
16
 
17
  This is based on the implementation of Whisper-Base found [here](https://github.com/huggingface/transformers/tree/v4.42.3/src/transformers/models/whisper).
18
- This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the [Qualcomm® AI Hub Models](https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/whisper_base) library to export with custom configurations. More details on model performance across various devices, can be found [here](#performance-summary).
19
 
20
  Qualcomm AI Hub Models uses [Qualcomm AI Hub Workbench](https://workbench.aihub.qualcomm.com) to compile, profile, and evaluate this model. [Sign up](https://myaccount.qualcomm.com/signup) to run these models on a hosted Qualcomm® device.
21
 
@@ -28,39 +28,38 @@ Below are pre-exported model assets ready for deployment.
28
 
29
  | Runtime | Precision | Chipset | SDK Versions | Download |
30
  |---|---|---|---|---|
31
- | PRECOMPILED_QNN_ONNX | float | Snapdragon® X Elite | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-precompiled_qnn_onnx-float-qualcomm_snapdragon_x_elite.zip)
32
- | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Gen 3 Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-precompiled_qnn_onnx-float-qualcomm_snapdragon_8gen3.zip)
33
- | PRECOMPILED_QNN_ONNX | float | Qualcomm® QCS8550 (Proxy) | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-precompiled_qnn_onnx-float-qualcomm_qcs8550_proxy.zip)
34
- | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-precompiled_qnn_onnx-float-qualcomm_snapdragon_8_elite_for_galaxy.zip)
35
- | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-precompiled_qnn_onnx-float-qualcomm_snapdragon_8_elite_gen5.zip)
36
- | PRECOMPILED_QNN_ONNX | float | Snapdragon® X2 Elite | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-precompiled_qnn_onnx-float-qualcomm_snapdragon_x2_elite.zip)
37
- | PRECOMPILED_QNN_ONNX | float | Qualcomm® QCS9075 | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-precompiled_qnn_onnx-float-qualcomm_qcs9075.zip)
38
- | QNN_CONTEXT_BINARY | float | Snapdragon® X Elite | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-qnn_context_binary-float-qualcomm_snapdragon_x_elite.zip)
39
- | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Gen 3 Mobile | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-qnn_context_binary-float-qualcomm_snapdragon_8gen3.zip)
40
- | QNN_CONTEXT_BINARY | float | Qualcomm® QCS8275 (Proxy) | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-qnn_context_binary-float-qualcomm_qcs8275_proxy.zip)
41
- | QNN_CONTEXT_BINARY | float | Qualcomm® QCS8550 (Proxy) | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-qnn_context_binary-float-qualcomm_qcs8550_proxy.zip)
42
- | QNN_CONTEXT_BINARY | float | Qualcomm® SA8775P | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-qnn_context_binary-float-qualcomm_sa8775p.zip)
43
- | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Elite For Galaxy Mobile | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-qnn_context_binary-float-qualcomm_snapdragon_8_elite_for_galaxy.zip)
44
- | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Elite Gen 5 Mobile | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-qnn_context_binary-float-qualcomm_snapdragon_8_elite_gen5.zip)
45
- | QNN_CONTEXT_BINARY | float | Snapdragon® X2 Elite | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-qnn_context_binary-float-qualcomm_snapdragon_x2_elite.zip)
46
- | QNN_CONTEXT_BINARY | float | Qualcomm® SA7255P | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-qnn_context_binary-float-qualcomm_sa7255p.zip)
47
- | QNN_CONTEXT_BINARY | float | Qualcomm® SA8295P | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-qnn_context_binary-float-qualcomm_sa8295p.zip)
48
- | QNN_CONTEXT_BINARY | float | Qualcomm® QCS9075 | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-qnn_context_binary-float-qualcomm_qcs9075.zip)
49
- | QNN_CONTEXT_BINARY | float | Qualcomm® QCS8450 (Proxy) | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.47.0/whisper_base-qnn_context_binary-float-qualcomm_qcs8450_proxy.zip)
50
 
51
  For more device-specific assets and performance metrics, visit **[Whisper-Base on Qualcomm® AI Hub](https://aihub.qualcomm.com/models/whisper_base)**.
52
 
53
 
54
  ### Option 2: Export with Custom Configurations
55
 
56
- Use the [Qualcomm® AI Hub Models](https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/whisper_base) Python library to compile and export the model with your own:
57
  - Custom weights (e.g., fine-tuned checkpoints)
58
  - Custom input shapes
59
  - Target device and runtime configurations
60
 
61
  This option is ideal if you need to customize the model beyond the default configuration provided here.
62
 
63
- See our repository for [Whisper-Base on GitHub](https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/whisper_base) for usage instructions.
64
 
65
  ## Model Details
66
 
@@ -78,44 +77,44 @@ See our repository for [Whisper-Base on GitHub](https://github.com/quic/ai-hub-m
78
  ## Performance Summary
79
  | Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit
80
  |---|---|---|---|---|---|---
81
- | HfWhisperDecoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® X Elite | 3.418 ms | 125 - 125 MB | NPU
82
- | HfWhisperDecoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Gen 3 Mobile | 3.275 ms | 3 - 9 MB | NPU
83
- | HfWhisperDecoder | PRECOMPILED_QNN_ONNX | float | Qualcomm® QCS8550 (Proxy) | 4.098 ms | 20 - 22 MB | NPU
84
- | HfWhisperDecoder | PRECOMPILED_QNN_ONNX | float | Qualcomm® QCS9075 | 4.646 ms | 20 - 43 MB | NPU
85
- | HfWhisperDecoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 2.667 ms | 16 - 23 MB | NPU
86
- | HfWhisperDecoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 2.487 ms | 20 - 31 MB | NPU
87
- | HfWhisperDecoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® X2 Elite | 1.959 ms | 126 - 126 MB | NPU
88
- | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Snapdragon® X Elite | 3.426 ms | 20 - 20 MB | NPU
89
- | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Gen 3 Mobile | 3.097 ms | 16 - 24 MB | NPU
90
- | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Qualcomm® QCS8275 (Proxy) | 6.467 ms | 20 - 28 MB | NPU
91
- | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Qualcomm® QCS8550 (Proxy) | 3.93 ms | 19 - 20 MB | NPU
92
- | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Qualcomm® SA8775P | 11.273 ms | 20 - 28 MB | NPU
93
- | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Qualcomm® QCS9075 | 4.658 ms | 20 - 44 MB | NPU
94
- | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Qualcomm® QCS8450 (Proxy) | 5.124 ms | 20 - 31 MB | NPU
95
- | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Qualcomm® SA7255P | 6.467 ms | 20 - 28 MB | NPU
96
- | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Qualcomm® SA8295P | 5.455 ms | 20 - 25 MB | NPU
97
- | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Elite For Galaxy Mobile | 2.64 ms | 0 - 9 MB | NPU
98
- | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Elite Gen 5 Mobile | 2.475 ms | 20 - 30 MB | NPU
99
- | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Snapdragon® X2 Elite | 2.23 ms | 20 - 20 MB | NPU
100
- | HfWhisperEncoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® X Elite | 37.258 ms | 66 - 66 MB | NPU
101
- | HfWhisperEncoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Gen 3 Mobile | 27.146 ms | 39 - 46 MB | NPU
102
- | HfWhisperEncoder | PRECOMPILED_QNN_ONNX | float | Qualcomm® QCS8550 (Proxy) | 36.577 ms | 3 - 52 MB | NPU
103
- | HfWhisperEncoder | PRECOMPILED_QNN_ONNX | float | Qualcomm® QCS9075 | 44.757 ms | 39 - 42 MB | NPU
104
- | HfWhisperEncoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 18.511 ms | 38 - 45 MB | NPU
105
- | HfWhisperEncoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 14.944 ms | 38 - 48 MB | NPU
106
- | HfWhisperEncoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® X2 Elite | 15.497 ms | 65 - 65 MB | NPU
107
- | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Snapdragon® X Elite | 37.271 ms | 0 - 0 MB | NPU
108
- | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Gen 3 Mobile | 27.1 ms | 1 - 8 MB | NPU
109
- | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Qualcomm® QCS8275 (Proxy) | 118.922 ms | 1 - 9 MB | NPU
110
- | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Qualcomm® QCS8550 (Proxy) | 36.764 ms | 1 - 2 MB | NPU
111
- | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Qualcomm® SA8775P | 197.838 ms | 1 - 10 MB | NPU
112
- | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Qualcomm® QCS9075 | 44.23 ms | 0 - 20 MB | NPU
113
- | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Qualcomm® QCS8450 (Proxy) | 97.613 ms | 1 - 11 MB | NPU
114
- | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Qualcomm® SA7255P | 118.922 ms | 1 - 9 MB | NPU
115
- | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Qualcomm® SA8295P | 73.938 ms | 0 - 6 MB | NPU
116
- | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Elite For Galaxy Mobile | 18.631 ms | 0 - 14 MB | NPU
117
- | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Elite Gen 5 Mobile | 14.547 ms | 1 - 10 MB | NPU
118
- | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Snapdragon® X2 Elite | 15.759 ms | 0 - 0 MB | NPU
119
 
120
  ## License
121
  * The license for the original implementation of Whisper-Base can be found
 
15
  HuggingFace Whisper-Small ASR (Automatic Speech Recognition) model is a state-of-the-art system designed for transcribing spoken language into written text. This model is based on the transformer architecture and has been optimized for edge inference by replacing Multi-Head Attention (MHA) with Single-Head Attention (SHA) and linear layers with convolutional (conv) layers. It exhibits robust performance in realistic, noisy environments, making it highly reliable for real-world applications. Specifically, it excels in long-form transcription, capable of accurately transcribing audio clips up to 30 seconds long. Time to the first token is the encoder's latency, while time to each additional token is decoder's latency, where we assume a max decoded length specified below.
16
 
17
  This is based on the implementation of Whisper-Base found [here](https://github.com/huggingface/transformers/tree/v4.42.3/src/transformers/models/whisper).
18
+ This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the [Qualcomm® AI Hub Models](https://github.com/qualcomm/ai-hub-models/blob/main/qai_hub_models/models/whisper_base) library to export with custom configurations. More details on model performance across various devices, can be found [here](#performance-summary).
19
 
20
  Qualcomm AI Hub Models uses [Qualcomm AI Hub Workbench](https://workbench.aihub.qualcomm.com) to compile, profile, and evaluate this model. [Sign up](https://myaccount.qualcomm.com/signup) to run these models on a hosted Qualcomm® device.
21
 
 
28
 
29
  | Runtime | Precision | Chipset | SDK Versions | Download |
30
  |---|---|---|---|---|
31
+ | PRECOMPILED_QNN_ONNX | float | Snapdragon® X2 Elite | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-precompiled_qnn_onnx-float-qualcomm_snapdragon_x2_elite.zip)
32
+ | PRECOMPILED_QNN_ONNX | float | Snapdragon® X Elite | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-precompiled_qnn_onnx-float-qualcomm_snapdragon_x_elite.zip)
33
+ | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Gen 3 Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-precompiled_qnn_onnx-float-qualcomm_snapdragon_8gen3.zip)
34
+ | PRECOMPILED_QNN_ONNX | float | Qualcomm® QCS8550 (Proxy) | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-precompiled_qnn_onnx-float-qualcomm_qcs8550_proxy.zip)
35
+ | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-precompiled_qnn_onnx-float-qualcomm_snapdragon_8_elite_for_galaxy.zip)
36
+ | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-precompiled_qnn_onnx-float-qualcomm_snapdragon_8_elite_gen5.zip)
37
+ | PRECOMPILED_QNN_ONNX | float | Qualcomm® QCS9075 | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-precompiled_qnn_onnx-float-qualcomm_qcs9075.zip)
38
+ | QNN_CONTEXT_BINARY | float | Snapdragon® X2 Elite | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-qnn_context_binary-float-qualcomm_snapdragon_x2_elite.zip)
39
+ | QNN_CONTEXT_BINARY | float | Snapdragon® X Elite | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-qnn_context_binary-float-qualcomm_snapdragon_x_elite.zip)
40
+ | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Gen 3 Mobile | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-qnn_context_binary-float-qualcomm_snapdragon_8gen3.zip)
41
+ | QNN_CONTEXT_BINARY | float | Qualcomm® QCS8550 (Proxy) | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-qnn_context_binary-float-qualcomm_qcs8550_proxy.zip)
42
+ | QNN_CONTEXT_BINARY | float | Qualcomm® SA8775P | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-qnn_context_binary-float-qualcomm_sa8775p.zip)
43
+ | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Elite For Galaxy Mobile | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-qnn_context_binary-float-qualcomm_snapdragon_8_elite_for_galaxy.zip)
44
+ | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Elite Gen 5 Mobile | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-qnn_context_binary-float-qualcomm_snapdragon_8_elite_gen5.zip)
45
+ | QNN_CONTEXT_BINARY | float | Qualcomm® SA7255P | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-qnn_context_binary-float-qualcomm_sa7255p.zip)
46
+ | QNN_CONTEXT_BINARY | float | Qualcomm® SA8295P | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-qnn_context_binary-float-qualcomm_sa8295p.zip)
47
+ | QNN_CONTEXT_BINARY | float | Qualcomm® QCS9075 | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-qnn_context_binary-float-qualcomm_qcs9075.zip)
48
+ | QNN_CONTEXT_BINARY | float | Qualcomm® QCS8450 (Proxy) | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/whisper_base/releases/v0.48.0/whisper_base-qnn_context_binary-float-qualcomm_qcs8450_proxy.zip)
 
49
 
50
  For more device-specific assets and performance metrics, visit **[Whisper-Base on Qualcomm® AI Hub](https://aihub.qualcomm.com/models/whisper_base)**.
51
 
52
 
53
  ### Option 2: Export with Custom Configurations
54
 
55
+ Use the [Qualcomm® AI Hub Models](https://github.com/qualcomm/ai-hub-models/blob/main/qai_hub_models/models/whisper_base) Python library to compile and export the model with your own:
56
  - Custom weights (e.g., fine-tuned checkpoints)
57
  - Custom input shapes
58
  - Target device and runtime configurations
59
 
60
  This option is ideal if you need to customize the model beyond the default configuration provided here.
61
 
62
+ See our repository for [Whisper-Base on GitHub](https://github.com/qualcomm/ai-hub-models/blob/main/qai_hub_models/models/whisper_base) for usage instructions.
63
 
64
  ## Model Details
65
 
 
77
  ## Performance Summary
78
  | Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit
79
  |---|---|---|---|---|---|---
80
+ | HfWhisperDecoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® X2 Elite | 1.9 ms | 126 - 126 MB | NPU
81
+ | HfWhisperDecoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® X Elite | 3.47 ms | 125 - 125 MB | NPU
82
+ | HfWhisperDecoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Gen 3 Mobile | 3.117 ms | 0 - 7 MB | NPU
83
+ | HfWhisperDecoder | PRECOMPILED_QNN_ONNX | float | Qualcomm® QCS8550 (Proxy) | 4.038 ms | 20 - 22 MB | NPU
84
+ | HfWhisperDecoder | PRECOMPILED_QNN_ONNX | float | Qualcomm® QCS9075 | 4.632 ms | 20 - 43 MB | NPU
85
+ | HfWhisperDecoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 2.693 ms | 19 - 26 MB | NPU
86
+ | HfWhisperDecoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 2.483 ms | 20 - 30 MB | NPU
87
+ | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Snapdragon® X2 Elite | 2.283 ms | 20 - 20 MB | NPU
88
+ | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Snapdragon® X Elite | 3.408 ms | 20 - 20 MB | NPU
89
+ | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Gen 3 Mobile | 3.048 ms | 2 - 9 MB | NPU
90
+ | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Qualcomm® QCS8275 (Proxy) | 6.532 ms | 20 - 27 MB | NPU
91
+ | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Qualcomm® QCS8550 (Proxy) | 3.999 ms | 17 - 19 MB | NPU
92
+ | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Qualcomm® SA8775P | 4.844 ms | 10 - 18 MB | NPU
93
+ | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Qualcomm® QCS9075 | 4.656 ms | 20 - 44 MB | NPU
94
+ | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Qualcomm® QCS8450 (Proxy) | 5.116 ms | 20 - 30 MB | NPU
95
+ | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Qualcomm® SA7255P | 6.532 ms | 20 - 27 MB | NPU
96
+ | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Qualcomm® SA8295P | 5.359 ms | 20 - 25 MB | NPU
97
+ | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Elite For Galaxy Mobile | 2.655 ms | 0 - 10 MB | NPU
98
+ | HfWhisperDecoder | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Elite Gen 5 Mobile | 2.482 ms | 20 - 30 MB | NPU
99
+ | HfWhisperEncoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® X2 Elite | 15.634 ms | 64 - 64 MB | NPU
100
+ | HfWhisperEncoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® X Elite | 37.131 ms | 66 - 66 MB | NPU
101
+ | HfWhisperEncoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Gen 3 Mobile | 27.019 ms | 41 - 48 MB | NPU
102
+ | HfWhisperEncoder | PRECOMPILED_QNN_ONNX | float | Qualcomm® QCS8550 (Proxy) | 36.861 ms | 3 - 52 MB | NPU
103
+ | HfWhisperEncoder | PRECOMPILED_QNN_ONNX | float | Qualcomm® QCS9075 | 44.202 ms | 39 - 42 MB | NPU
104
+ | HfWhisperEncoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 18.864 ms | 39 - 50 MB | NPU
105
+ | HfWhisperEncoder | PRECOMPILED_QNN_ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 15.038 ms | 39 - 49 MB | NPU
106
+ | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Snapdragon® X2 Elite | 15.829 ms | 0 - 0 MB | NPU
107
+ | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Snapdragon® X Elite | 37.228 ms | 0 - 0 MB | NPU
108
+ | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Gen 3 Mobile | 27.327 ms | 0 - 8 MB | NPU
109
+ | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Qualcomm® QCS8275 (Proxy) | 119.325 ms | 1 - 9 MB | NPU
110
+ | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Qualcomm® QCS8550 (Proxy) | 36.574 ms | 1 - 8 MB | NPU
111
+ | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Qualcomm® SA8775P | 197.767 ms | 1 - 10 MB | NPU
112
+ | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Qualcomm® QCS9075 | 44.299 ms | 0 - 20 MB | NPU
113
+ | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Qualcomm® QCS8450 (Proxy) | 97.845 ms | 1 - 11 MB | NPU
114
+ | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Qualcomm® SA7255P | 119.325 ms | 1 - 9 MB | NPU
115
+ | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Qualcomm® SA8295P | 73.975 ms | 0 - 5 MB | NPU
116
+ | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Elite For Galaxy Mobile | 18.34 ms | 1 - 10 MB | NPU
117
+ | HfWhisperEncoder | QNN_CONTEXT_BINARY | float | Snapdragon® 8 Elite Gen 5 Mobile | 14.625 ms | 0 - 11 MB | NPU
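
The README text in this diff says that time to the first token is the encoder's latency and that each additional token costs one decoder latency. A minimal sketch of how the table figures could be combined into a rough end-to-end estimate follows. The helper name `estimate_latency_ms` and the token count of 50 are hypothetical, chosen only for illustration; the maximum decoded length is not shown in this diff. The two latency values are the Snapdragon® X2 Elite `QNN_CONTEXT_BINARY` rows from the updated table.

```python
def estimate_latency_ms(encoder_ms: float, decoder_ms: float, num_tokens: int) -> float:
    """Rough transcription latency estimate (hypothetical helper).

    Assumes the encoder runs once (time to first token) and the decoder
    runs once for each additional token, as described in the README text.
    """
    if num_tokens < 1:
        raise ValueError("num_tokens must be at least 1")
    return encoder_ms + (num_tokens - 1) * decoder_ms


# Values copied from the updated Performance Summary
# (Snapdragon X2 Elite, QNN_CONTEXT_BINARY, float).
ENCODER_MS = 15.829
DECODER_MS = 2.283

# 50 tokens is an assumed output length, not a figure from the README.
total_ms = estimate_latency_ms(ENCODER_MS, DECODER_MS, num_tokens=50)
print(f"Estimated latency for 50 tokens: {total_ms:.3f} ms")
```

This ignores audio preprocessing, tokenization, and host-to-device transfer, so it should be read as a lower bound on wall-clock time rather than a measured result.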
118
 
119
  ## License
120
  * The license for the original implementation of Whisper-Base can be found