qaihm-bot committed on
Commit
3db66f7
·
verified ·
1 Parent(s): ee13d98

See https://github.com/quic/ai-hub-models/releases/v0.45.0 for changelog.

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. LICENSE +1 -0
  2. README.md +268 -0
  3. precompiled/qualcomm-qcs8275-proxy/Zipformer_ZipformerDecoder_float.bin +3 -0
  4. precompiled/qualcomm-qcs8275-proxy/Zipformer_ZipformerEncoder_float.bin +3 -0
  5. precompiled/qualcomm-qcs8275-proxy/Zipformer_ZipformerJoiner_float.bin +3 -0
  6. precompiled/qualcomm-qcs8275-proxy/tool-versions.yaml +3 -0
  7. precompiled/qualcomm-qcs8450-proxy/Zipformer_ZipformerDecoder_float.bin +3 -0
  8. precompiled/qualcomm-qcs8450-proxy/Zipformer_ZipformerEncoder_float.bin +3 -0
  9. precompiled/qualcomm-qcs8450-proxy/Zipformer_ZipformerJoiner_float.bin +3 -0
  10. precompiled/qualcomm-qcs8450-proxy/tool-versions.yaml +3 -0
  11. precompiled/qualcomm-qcs8550-proxy/Zipformer_ZipformerDecoder_float.bin +3 -0
  12. precompiled/qualcomm-qcs8550-proxy/Zipformer_ZipformerDecoder_float.onnx.zip +3 -0
  13. precompiled/qualcomm-qcs8550-proxy/Zipformer_ZipformerEncoder_float.bin +3 -0
  14. precompiled/qualcomm-qcs8550-proxy/Zipformer_ZipformerEncoder_float.onnx.zip +3 -0
  15. precompiled/qualcomm-qcs8550-proxy/Zipformer_ZipformerJoiner_float.bin +3 -0
  16. precompiled/qualcomm-qcs8550-proxy/Zipformer_ZipformerJoiner_float.onnx.zip +3 -0
  17. precompiled/qualcomm-qcs8550-proxy/tool-versions.yaml +4 -0
  18. precompiled/qualcomm-qcs9075-proxy/Zipformer_ZipformerDecoder_float.bin +3 -0
  19. precompiled/qualcomm-qcs9075-proxy/Zipformer_ZipformerEncoder_float.bin +3 -0
  20. precompiled/qualcomm-qcs9075-proxy/Zipformer_ZipformerJoiner_float.bin +3 -0
  21. precompiled/qualcomm-qcs9075-proxy/tool-versions.yaml +3 -0
  22. precompiled/qualcomm-sa7255p/Zipformer_ZipformerDecoder_float.bin +3 -0
  23. precompiled/qualcomm-sa7255p/Zipformer_ZipformerEncoder_float.bin +3 -0
  24. precompiled/qualcomm-sa7255p/Zipformer_ZipformerJoiner_float.bin +3 -0
  25. precompiled/qualcomm-sa7255p/tool-versions.yaml +3 -0
  26. precompiled/qualcomm-sa8295p/Zipformer_ZipformerDecoder_float.bin +3 -0
  27. precompiled/qualcomm-sa8295p/Zipformer_ZipformerEncoder_float.bin +3 -0
  28. precompiled/qualcomm-sa8295p/Zipformer_ZipformerJoiner_float.bin +3 -0
  29. precompiled/qualcomm-sa8295p/tool-versions.yaml +3 -0
  30. precompiled/qualcomm-sa8775p/Zipformer_ZipformerDecoder_float.bin +3 -0
  31. precompiled/qualcomm-sa8775p/Zipformer_ZipformerEncoder_float.bin +3 -0
  32. precompiled/qualcomm-sa8775p/Zipformer_ZipformerJoiner_float.bin +3 -0
  33. precompiled/qualcomm-sa8775p/tool-versions.yaml +3 -0
  34. precompiled/qualcomm-snapdragon-8-elite-for-galaxy/Zipformer_ZipformerDecoder_float.bin +3 -0
  35. precompiled/qualcomm-snapdragon-8-elite-for-galaxy/Zipformer_ZipformerDecoder_float.onnx.zip +3 -0
  36. precompiled/qualcomm-snapdragon-8-elite-for-galaxy/Zipformer_ZipformerEncoder_float.bin +3 -0
  37. precompiled/qualcomm-snapdragon-8-elite-for-galaxy/Zipformer_ZipformerEncoder_float.onnx.zip +3 -0
  38. precompiled/qualcomm-snapdragon-8-elite-for-galaxy/Zipformer_ZipformerJoiner_float.bin +3 -0
  39. precompiled/qualcomm-snapdragon-8-elite-for-galaxy/Zipformer_ZipformerJoiner_float.onnx.zip +3 -0
  40. precompiled/qualcomm-snapdragon-8-elite-for-galaxy/tool-versions.yaml +4 -0
  41. precompiled/qualcomm-snapdragon-8-elite-gen5/Zipformer_ZipformerDecoder_float.bin +3 -0
  42. precompiled/qualcomm-snapdragon-8-elite-gen5/Zipformer_ZipformerDecoder_float.onnx.zip +3 -0
  43. precompiled/qualcomm-snapdragon-8-elite-gen5/Zipformer_ZipformerEncoder_float.bin +3 -0
  44. precompiled/qualcomm-snapdragon-8-elite-gen5/Zipformer_ZipformerEncoder_float.onnx.zip +3 -0
  45. precompiled/qualcomm-snapdragon-8-elite-gen5/Zipformer_ZipformerJoiner_float.bin +3 -0
  46. precompiled/qualcomm-snapdragon-8-elite-gen5/Zipformer_ZipformerJoiner_float.onnx.zip +3 -0
  47. precompiled/qualcomm-snapdragon-8-elite-gen5/tool-versions.yaml +4 -0
  48. precompiled/qualcomm-snapdragon-8gen3/Zipformer_ZipformerDecoder_float.bin +3 -0
  49. precompiled/qualcomm-snapdragon-8gen3/Zipformer_ZipformerDecoder_float.onnx.zip +3 -0
  50. precompiled/qualcomm-snapdragon-8gen3/Zipformer_ZipformerEncoder_float.bin +3 -0
LICENSE ADDED
@@ -0,0 +1 @@
1
+ The license of the original trained model can be found at https://github.com/huggingface/transformers/blob/v4.42.3/LICENSE.
README.md ADDED
@@ -0,0 +1,268 @@
1
+ ---
2
+ library_name: pytorch
3
+ license: other
4
+ tags:
5
+ - foundation
6
+ - android
7
+ pipeline_tag: automatic-speech-recognition
8
+
9
+ ---
10
+
11
+ ![](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/zipformer/web-assets/model_demo.png)
12
+
13
+ # Zipformer: Optimized for Mobile Deployment
14
+ ## Transformer-based automatic speech recognition (ASR) model for English and Chinese
15
+
16
+
17
+ Zipformer is a state-of-the-art streaming ASR (Automatic Speech Recognition) model that transcribes spoken language into written text as the audio arrives. It is based on the transformer architecture and has been optimized for edge inference by replacing linear layers with convolutional (conv) layers. It performs robustly in realistic, noisy environments, which makes it reliable for real-world applications, and it is particularly strong at long-form transcription. Time to first token is the encoder's latency, and time to each additional token is the joiner's latency, assuming the max decoded sequence length specified below.
18
+
19
+ This model is an implementation of Zipformer found [here](https://github.com/k2-fsa/icefall).
20
+
21
+
22
+ This repository provides scripts to run Zipformer on Qualcomm® devices.
23
+ More details on model performance across various devices can be found
24
+ [here](https://aihub.qualcomm.com/models/zipformer).
25
+
26
+
27
+
28
+ ### Model Details
29
+
30
+ - **Model Type:** Speech recognition
31
+ - **Model Stats:**
32
+   - Model checkpoint: pfluo/k2fsa-zipformer-chinese-english-mixed
33
+   - Input resolution: 80x71 (0.71 seconds audio)
34
+   - Max decoded sequence length: 200 tokens
35
+   - Number of parameters (ZipformerEncoder): 63.2M
36
+   - Model size (ZipformerEncoder) (float): 242 MB
37
+   - Number of parameters (ZipformerDecoder): 3.47M
38
+   - Model size (ZipformerDecoder) (float): 13.2 MB
39
+   - Number of parameters (ZipformerJoiner): 3.21M
40
+   - Model size (ZipformerJoiner) (float): 12.2 MB
41
+
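As a quick sanity check on the stats above, the listed float sizes are close to what 4 bytes per parameter would give. The sketch below assumes float32 weights and that "MB" in the list means 2^20 bytes; neither is stated in the README, and the function name is illustrative.

```python
# Rough size estimate: parameter count x bytes per parameter.
# Assumptions (not stated in the README): float32 weights, 1 MB = 2**20 bytes.
BYTES_PER_FLOAT32 = 4
MIB = 2**20

# Parameter counts as listed under Model Stats (already rounded there).
PARAM_COUNTS = {
    "ZipformerEncoder": 63.2e6,
    "ZipformerDecoder": 3.47e6,
    "ZipformerJoiner": 3.21e6,
}


def estimated_size_mib(num_params: float) -> float:
    """Return the approximate float32 size in MiB for a parameter count."""
    return num_params * BYTES_PER_FLOAT32 / MIB


for name, count in PARAM_COUNTS.items():
    print(f"{name}: ~{estimated_size_mib(count):.1f} MiB")
```

Under these assumptions each estimate lands within about 1 MB of the listed size; the small gaps are consistent with the parameter counts being rounded.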
42
+ | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model |
43
+ |---|---|---|---|---|---|---|---|---|
44
+ | ZipformerEncoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_CONTEXT_BINARY | 23.393 ms | 3 - 10 MB | NPU | Use Export Script |
45
+ | ZipformerEncoder | float | QCS8450 (Proxy) | Qualcomm® QCS8450 (Proxy) | QNN_CONTEXT_BINARY | 11.889 ms | 2 - 12 MB | NPU | Use Export Script |
46
+ | ZipformerEncoder | float | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_CONTEXT_BINARY | 8.84 ms | 3 - 5 MB | NPU | Use Export Script |
47
+ | ZipformerEncoder | float | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | PRECOMPILED_QNN_ONNX | 9.665 ms | 10 - 11 MB | NPU | Use Export Script |
48
+ | ZipformerEncoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_CONTEXT_BINARY | 41.908 ms | 3 - 11 MB | NPU | Use Export Script |
49
+ | ZipformerEncoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN_CONTEXT_BINARY | 23.393 ms | 3 - 10 MB | NPU | Use Export Script |
50
+ | ZipformerEncoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN_CONTEXT_BINARY | 13.206 ms | 0 - 5 MB | NPU | Use Export Script |
51
+ | ZipformerEncoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN_CONTEXT_BINARY | 41.908 ms | 3 - 11 MB | NPU | Use Export Script |
52
+ | ZipformerEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_CONTEXT_BINARY | 6.294 ms | 2 - 11 MB | NPU | Use Export Script |
53
+ | ZipformerEncoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | PRECOMPILED_QNN_ONNX | 7.014 ms | 9 - 17 MB | NPU | Use Export Script |
54
+ | ZipformerEncoder | float | Samsung Galaxy S25 | Snapdragon® 8 Elite For Galaxy Mobile | QNN_CONTEXT_BINARY | 5.417 ms | 3 - 15 MB | NPU | Use Export Script |
55
+ | ZipformerEncoder | float | Samsung Galaxy S25 | Snapdragon® 8 Elite For Galaxy Mobile | PRECOMPILED_QNN_ONNX | 5.879 ms | 9 - 21 MB | NPU | Use Export Script |
56
+ | ZipformerEncoder | float | Snapdragon 8 Elite Gen 5 QRD | Snapdragon® 8 Elite Gen 5 Mobile | QNN_CONTEXT_BINARY | 4.835 ms | 3 - 13 MB | NPU | Use Export Script |
57
+ | ZipformerEncoder | float | Snapdragon 8 Elite Gen 5 QRD | Snapdragon® 8 Elite Gen 5 Mobile | PRECOMPILED_QNN_ONNX | 5.367 ms | 10 - 20 MB | NPU | Use Export Script |
58
+ | ZipformerEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 9.385 ms | 3 - 3 MB | NPU | Use Export Script |
59
+ | ZipformerEncoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | PRECOMPILED_QNN_ONNX | 9.807 ms | 142 - 142 MB | NPU | Use Export Script |
60
+ | ZipformerDecoder | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_CONTEXT_BINARY | 0.302 ms | 0 - 8 MB | NPU | Use Export Script |
61
+ | ZipformerDecoder | float | QCS8450 (Proxy) | Qualcomm® QCS8450 (Proxy) | QNN_CONTEXT_BINARY | 0.098 ms | 0 - 9 MB | NPU | Use Export Script |
62
+ | ZipformerDecoder | float | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_CONTEXT_BINARY | 0.073 ms | 0 - 2 MB | NPU | Use Export Script |
63
+ | ZipformerDecoder | float | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | PRECOMPILED_QNN_ONNX | 0.211 ms | 0 - 8 MB | NPU | Use Export Script |
64
+ | ZipformerDecoder | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_CONTEXT_BINARY | 0.583 ms | 0 - 8 MB | NPU | Use Export Script |
65
+ | ZipformerDecoder | float | SA7255P ADP | Qualcomm® SA7255P | QNN_CONTEXT_BINARY | 0.302 ms | 0 - 8 MB | NPU | Use Export Script |
66
+ | ZipformerDecoder | float | SA8295P ADP | Qualcomm® SA8295P | QNN_CONTEXT_BINARY | 0.319 ms | 0 - 5 MB | NPU | Use Export Script |
67
+ | ZipformerDecoder | float | SA8775P ADP | Qualcomm® SA8775P | QNN_CONTEXT_BINARY | 0.583 ms | 0 - 8 MB | NPU | Use Export Script |
68
+ | ZipformerDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_CONTEXT_BINARY | 0.055 ms | 0 - 7 MB | NPU | Use Export Script |
69
+ | ZipformerDecoder | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | PRECOMPILED_QNN_ONNX | 0.2 ms | 0 - 8 MB | NPU | Use Export Script |
70
+ | ZipformerDecoder | float | Samsung Galaxy S25 | Snapdragon® 8 Elite For Galaxy Mobile | QNN_CONTEXT_BINARY | 0.05 ms | 0 - 9 MB | NPU | Use Export Script |
71
+ | ZipformerDecoder | float | Samsung Galaxy S25 | Snapdragon® 8 Elite For Galaxy Mobile | PRECOMPILED_QNN_ONNX | 0.157 ms | 0 - 12 MB | NPU | Use Export Script |
72
+ | ZipformerDecoder | float | Snapdragon 8 Elite Gen 5 QRD | Snapdragon® 8 Elite Gen 5 Mobile | QNN_CONTEXT_BINARY | 0.045 ms | 0 - 10 MB | NPU | Use Export Script |
73
+ | ZipformerDecoder | float | Snapdragon 8 Elite Gen 5 QRD | Snapdragon® 8 Elite Gen 5 Mobile | PRECOMPILED_QNN_ONNX | 0.155 ms | 0 - 11 MB | NPU | Use Export Script |
74
+ | ZipformerDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 0.151 ms | 0 - 0 MB | NPU | Use Export Script |
75
+ | ZipformerDecoder | float | Snapdragon X Elite CRD | Snapdragon® X Elite | PRECOMPILED_QNN_ONNX | 0.154 ms | 7 - 7 MB | NPU | Use Export Script |
76
+ | ZipformerJoiner | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_CONTEXT_BINARY | 0.465 ms | 0 - 8 MB | NPU | Use Export Script |
77
+ | ZipformerJoiner | float | QCS8450 (Proxy) | Qualcomm® QCS8450 (Proxy) | QNN_CONTEXT_BINARY | 0.231 ms | 0 - 9 MB | NPU | Use Export Script |
78
+ | ZipformerJoiner | float | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_CONTEXT_BINARY | 0.189 ms | 0 - 1 MB | NPU | Use Export Script |
79
+ | ZipformerJoiner | float | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | PRECOMPILED_QNN_ONNX | 0.352 ms | 0 - 8 MB | NPU | Use Export Script |
80
+ | ZipformerJoiner | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_CONTEXT_BINARY | 0.328 ms | 0 - 5 MB | NPU | Use Export Script |
81
+ | ZipformerJoiner | float | SA7255P ADP | Qualcomm® SA7255P | QNN_CONTEXT_BINARY | 0.465 ms | 0 - 8 MB | NPU | Use Export Script |
82
+ | ZipformerJoiner | float | SA8295P ADP | Qualcomm® SA8295P | QNN_CONTEXT_BINARY | 0.454 ms | 0 - 5 MB | NPU | Use Export Script |
83
+ | ZipformerJoiner | float | SA8775P ADP | Qualcomm® SA8775P | QNN_CONTEXT_BINARY | 0.328 ms | 0 - 5 MB | NPU | Use Export Script |
84
+ | ZipformerJoiner | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_CONTEXT_BINARY | 0.154 ms | 0 - 8 MB | NPU | Use Export Script |
85
+ | ZipformerJoiner | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | PRECOMPILED_QNN_ONNX | 0.328 ms | 0 - 7 MB | NPU | Use Export Script |
86
+ | ZipformerJoiner | float | Samsung Galaxy S25 | Snapdragon® 8 Elite For Galaxy Mobile | QNN_CONTEXT_BINARY | 0.132 ms | 0 - 9 MB | NPU | Use Export Script |
87
+ | ZipformerJoiner | float | Samsung Galaxy S25 | Snapdragon® 8 Elite For Galaxy Mobile | PRECOMPILED_QNN_ONNX | 0.23 ms | 0 - 7 MB | NPU | Use Export Script |
88
+ | ZipformerJoiner | float | Snapdragon 8 Elite Gen 5 QRD | Snapdragon® 8 Elite Gen 5 Mobile | QNN_CONTEXT_BINARY | 0.13 ms | 0 - 9 MB | NPU | Use Export Script |
89
+ | ZipformerJoiner | float | Snapdragon 8 Elite Gen 5 QRD | Snapdragon® 8 Elite Gen 5 Mobile | PRECOMPILED_QNN_ONNX | 0.226 ms | 0 - 11 MB | NPU | Use Export Script |
90
+ | ZipformerJoiner | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 0.263 ms | 0 - 0 MB | NPU | Use Export Script |
91
+ | ZipformerJoiner | float | Snapdragon X Elite CRD | Snapdragon® X Elite | PRECOMPILED_QNN_ONNX | 0.247 ms | 6 - 6 MB | NPU | Use Export Script |
92
+
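The introduction describes time to first token as the encoder's latency and time to each additional token as the joiner's latency. A minimal sketch of that arithmetic follows, using the Samsung Galaxy S25 `QNN_CONTEXT_BINARY` rows from the table. The function name is hypothetical, and ignoring decoder latency and host-side overhead is a simplifying assumption, so read the result as a rough estimate rather than a measurement.

```python
def estimated_decode_time_ms(encoder_ms: float, joiner_ms: float, num_tokens: int) -> float:
    """Encoder latency for the first token, joiner latency for each further token."""
    if num_tokens < 1:
        raise ValueError("num_tokens must be at least 1")
    return encoder_ms + (num_tokens - 1) * joiner_ms


# Values copied from the table (Samsung Galaxy S25, QNN_CONTEXT_BINARY).
ENCODER_MS = 5.417
JOINER_MS = 0.132
MAX_DECODED_TOKENS = 200  # "Max decoded sequence length" from Model Stats

first_token_ms = estimated_decode_time_ms(ENCODER_MS, JOINER_MS, 1)
max_length_ms = estimated_decode_time_ms(ENCODER_MS, JOINER_MS, MAX_DECODED_TOKENS)
print(f"time to first token: {first_token_ms:.3f} ms")
print(f"estimate at max decoded length: {max_length_ms:.3f} ms")
```

With these inputs the estimate at the 200-token limit comes to roughly 31.7 ms.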
93
+
94
+
95
+
96
+ ## Installation
97
+
98
+
99
+ Install the package via pip:
100
+ ```bash
101
+ # NOTE: 3.10 <= PYTHON_VERSION < 3.14 is supported.
102
+ pip install torch==2.9.0+cpu -f https://download.pytorch.org/whl/torch/
103
+ pip install "qai-hub-models[zipformer]" k2==1.24.4.dev20251029+cpu.torch2.9.0 -f https://k2-fsa.github.io/k2/cpu.html
104
+ ```
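Before installing, it can help to confirm that the interpreter falls inside the supported range noted in the comment above (3.10 <= version < 3.14). This is a small standalone sketch; the helper name `is_supported` is illustrative and not part of `qai-hub-models`.

```python
import sys

# Supported range taken from the install note: 3.10 <= PYTHON_VERSION < 3.14.
MIN_VERSION = (3, 10)
MAX_VERSION_EXCLUSIVE = (3, 14)


def is_supported(version=None) -> bool:
    """Return True if the (major, minor) version is inside the supported range."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


current = ".".join(str(part) for part in sys.version_info[:3])
status = "supported" if is_supported() else "outside the supported range"
print(f"Python {current} is {status}")
```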
105
+
106
+
107
+ ## Configure Qualcomm® AI Hub Workbench to run this model on a cloud-hosted device
108
+
109
+ Sign in to [Qualcomm® AI Hub Workbench](https://workbench.aihub.qualcomm.com/) with your
110
+ Qualcomm® ID. Once signed in, navigate to `Account -> Settings -> API Token`.
111
+
112
+ With this API token, you can configure your client to run models on
113
+ cloud-hosted devices.
114
+ ```bash
115
+ qai-hub configure --api_token API_TOKEN
116
+ ```
117
+ Navigate to [docs](https://workbench.aihub.qualcomm.com/docs/) for more information.
118
+
119
+
120
+
121
+ ## Demo off target
122
+
123
+ The package contains a simple end-to-end demo that downloads pre-trained
124
+ weights and runs this model on a sample input.
125
+
126
+ ```bash
127
+ python -m qai_hub_models.models.zipformer.demo
128
+ ```
129
+
130
+ The above demo runs a reference implementation of pre-processing, model
131
+ inference, and post-processing.
132
+
133
+ **NOTE**: If you are running in a Jupyter Notebook or a Google Colab-like
134
+ environment, add the following to your cell instead of the command above.
135
+ ```
136
+ %run -m qai_hub_models.models.zipformer.demo
137
+ ```
138
+
139
+
140
+ ### Run model on a cloud-hosted device
141
+
142
+ In addition to the demo, you can also run the model on a cloud-hosted Qualcomm®
143
+ device. The export script below does the following:
144
+ * Profiles the model's performance on a cloud-hosted device.
145
+ * Downloads compiled assets that can be deployed on-device for Android.
146
+ * Checks accuracy between PyTorch and on-device outputs.
147
+
148
+ ```bash
149
+ python -m qai_hub_models.models.zipformer.export
150
+ ```
151
+
152
+
153
+
154
+ ## How does this work?
155
+
156
+ This [export script](https://aihub.qualcomm.com/models/zipformer/qai_hub_models/models/Zipformer/export.py)
157
+ leverages [Qualcomm® AI Hub](https://aihub.qualcomm.com/) to optimize, validate, and deploy this model
158
+ on-device. Let's go through each step below in detail:
159
+
160
+ Step 1: **Compile model for on-device deployment**
161
+
162
+ To compile a PyTorch model for on-device deployment, we first trace the model
163
+ in memory using `torch.jit.trace` and then call the `submit_compile_job` API.
164
+
165
+ ```python
166
+ import torch
167
+
168
+ import qai_hub as hub
169
+ from qai_hub_models.models.zipformer import Model
170
+
171
+ # Load the model
172
+ torch_model = Model.from_pretrained()
173
+
174
+ # Device
175
+ device = hub.Device("Samsung Galaxy S25")
176
+
177
+ # Trace model
178
+ input_shape = torch_model.get_input_spec()
179
+ sample_inputs = torch_model.sample_inputs()
180
+
181
+ pt_model = torch.jit.trace(torch_model, [torch.tensor(data[0]) for _, data in sample_inputs.items()])
182
+
183
+ # Compile model on a specific device
184
+ compile_job = hub.submit_compile_job(
185
+     model=pt_model,
186
+     device=device,
187
+     input_specs=torch_model.get_input_spec(),
188
+ )
189
+
190
+ # Get target model to run on-device
191
+ target_model = compile_job.get_target_model()
192
+
193
+ ```
194
+
195
+
196
+ Step 2: **Performance profiling on cloud-hosted device**
197
+
198
+ After compiling the model in step 1, you can profile it on-device using the
199
+ `target_model`. Note that this script runs the model on a device automatically
200
+ provisioned in the cloud. Once the job is submitted, you can navigate to a
201
+ provided job URL to view a variety of on-device performance metrics.
202
+ ```python
203
+ profile_job = hub.submit_profile_job(
204
+     model=target_model,
205
+     device=device,
206
+ )
207
+
208
+ ```
209
+
210
+ Step 3: **Verify on-device accuracy**
211
+
212
+ To verify the accuracy of the model on-device, you can run on-device inference
213
+ on sample input data on the same cloud-hosted device.
214
+ ```python
215
+ input_data = torch_model.sample_inputs()
216
+ inference_job = hub.submit_inference_job(
217
+     model=target_model,
218
+     device=device,
219
+     inputs=input_data,
220
+ )
221
+ on_device_output = inference_job.download_output_data()
222
+
223
+ ```
224
+ With the model's output, you can compute metrics such as PSNR and relative error, or
225
+ spot-check the output against the expected output.
226
+
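The README mentions PSNR and relative error as accuracy checks. Below is a minimal, dependency-free sketch of both metrics over flat sequences of floats. How the values are pulled out of `on_device_output` and the PyTorch reference is not shown here, and the sample numbers are made up for illustration; the function names are hypothetical as well.

```python
import math
from typing import Sequence


def psnr_db(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Peak signal-to-noise ratio in dB, using the reference's peak magnitude."""
    if len(reference) == 0 or len(reference) != len(candidate):
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # outputs are identical
    peak = max(abs(r) for r in reference)
    if peak == 0:
        raise ValueError("reference is all zeros; PSNR is undefined")
    return 10 * math.log10(peak**2 / mse)


def max_relative_error(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Largest absolute difference, scaled by the reference's peak magnitude."""
    if len(reference) == 0 or len(reference) != len(candidate):
        raise ValueError("inputs must be non-empty and the same length")
    peak = max(abs(r) for r in reference)
    if peak == 0:
        raise ValueError("reference is all zeros; relative error is undefined")
    return max(abs(r - c) for r, c in zip(reference, candidate)) / peak


# Made-up flattened outputs standing in for PyTorch and on-device results.
torch_values = [1.0, 0.0, -1.0, 0.5]
device_values = [0.9, 0.1, -0.9, 0.4]
print(f"PSNR: {psnr_db(torch_values, device_values):.2f} dB")
print(f"max relative error: {max_relative_error(torch_values, device_values):.3f}")
```

Scaling by the reference's peak magnitude is one common convention; a different normalization may suit outputs with a known fixed range.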
227
+ **Note**: On-device profiling and inference require access to Qualcomm®
228
+ AI Hub Workbench. [Sign up for access](https://myaccount.qualcomm.com/signup).
229
+
230
+
231
+
232
+
233
+ ## Deploying compiled model to Android
234
+
235
+
236
+ The models can be deployed using multiple runtimes:
237
+ - TensorFlow Lite (`.tflite` export): [This
238
+ tutorial](https://www.tensorflow.org/lite/android/quickstart) provides a
239
+ guide to deploy the .tflite model in an Android application.
240
+
241
+
242
+ - QNN (`.so` export): This [sample
243
+ app](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/sample_app.html)
244
+ provides instructions on how to use the `.so` shared library in an Android application.
245
+
246
+
247
+ ## View on Qualcomm® AI Hub
248
+ Get more details on Zipformer's performance across various devices [here](https://aihub.qualcomm.com/models/zipformer).
249
+ Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)
250
+
251
+
252
+ ## License
253
+ * The license for the original implementation of Zipformer can be found
254
+ [here](https://github.com/huggingface/transformers/blob/v4.42.3/LICENSE).
255
+
256
+
257
+
258
+ ## References
259
+ * [Zipformer: A faster and better encoder for automatic speech recognition](https://openreview.net/forum?id=9WD9KwssyT)
260
+ * [Source Model Implementation](https://github.com/k2-fsa/icefall)
261
+
262
+
263
+
264
+ ## Community
265
+ * Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions and learn more about on-device AI.
266
+ * For questions or feedback, please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).
267
+
268
+
precompiled/qualcomm-qcs8275-proxy/Zipformer_ZipformerDecoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5363413e257201699a5a517e5c05cfd9a5e60ce7af085ae72c52cb8a8f5a768d
3
+ size 7045120
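The three added lines above are a Git LFS pointer: the repository stores only the spec version, a SHA-256 object ID, and the byte size, while the binary itself lives in LFS storage. As a sketch of how a downloaded file could be checked against such a pointer, the helpers below parse the pointer text and compare size and digest. The function names are illustrative, and reading the whole file into memory is a simplification (a large binary would normally be hashed in chunks).

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse 'key value' lines of a Git LFS pointer into a small dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if the bytes have the size and SHA-256 digest the pointer records."""
    if pointer["hash_algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['hash_algo']}")
    return len(data) == pointer["size"] and hashlib.sha256(data).hexdigest() == pointer["digest"]


# Pointer contents copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5363413e257201699a5a517e5c05cfd9a5e60ce7af085ae72c52cb8a8f5a768d
size 7045120
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["hash_algo"], pointer["size"])
```

To check a real download, pass the file's bytes (for example from `open(path, "rb").read()`) to `matches_pointer` together with the parsed pointer.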
precompiled/qualcomm-qcs8275-proxy/Zipformer_ZipformerEncoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2e7ce06d26dcbafd44e26f3521aa1ccbc0ea69ae44c5802a3ec348813d6c3a9a
3
+ size 151408640
precompiled/qualcomm-qcs8275-proxy/Zipformer_ZipformerJoiner_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9a0c7f430dfad84afe210c6ed4546bc06b5f9c3e96ee27079e33d34c19f4db42
3
+ size 6529024
precompiled/qualcomm-qcs8275-proxy/tool-versions.yaml ADDED
@@ -0,0 +1,3 @@
1
+ tool_versions:
2
+ qnn_context_binary:
3
+ qairt: 2.41.0.251128145156_191518-auto
precompiled/qualcomm-qcs8450-proxy/Zipformer_ZipformerDecoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e715304ca5d1ce417fe3ebf09bdc87536f7d588191e338a55bcc58b427706680
3
+ size 7045120
precompiled/qualcomm-qcs8450-proxy/Zipformer_ZipformerEncoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1f3e02f37dca952c7fe1d2c19c6952aa1b05443e761c559ecd58ed9fd856cb21
3
+ size 150818816
precompiled/qualcomm-qcs8450-proxy/Zipformer_ZipformerJoiner_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b9b7d9f199a0388cb3f9c06e842a593f54d40b78bcb96eec683faf997101e55e
3
+ size 6500352
precompiled/qualcomm-qcs8450-proxy/tool-versions.yaml ADDED
@@ -0,0 +1,3 @@
1
+ tool_versions:
2
+ qnn_context_binary:
3
+ qairt: 2.41.0.251128145156_191518
precompiled/qualcomm-qcs8550-proxy/Zipformer_ZipformerDecoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:375e2e17df6468f302d102605bf96e4d2dafb88dda8c0326b1b658566937ae3d
3
+ size 7045120
precompiled/qualcomm-qcs8550-proxy/Zipformer_ZipformerDecoder_float.onnx.zip ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:77ac8aa72661e27eeec75623e9d79e0050f16a38b6b8bc55e0264a7fb7adcbe4
3
+ size 6501936
precompiled/qualcomm-qcs8550-proxy/Zipformer_ZipformerEncoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:13ed7d514daec2f6d6ba279ca05c665ab78c8174e9297a8c874009085c70e709
3
+ size 151408640
precompiled/qualcomm-qcs8550-proxy/Zipformer_ZipformerEncoder_float.onnx.zip ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6e2f45c2daa00177397145225c002b4683bdcb2f9c78c7f13dfbbded51bd33d3
3
+ size 118423850
precompiled/qualcomm-qcs8550-proxy/Zipformer_ZipformerJoiner_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:123494094260a4e75518df48c335cdd5c405e0e9499090ed6584075d6442a040
3
+ size 6529024
precompiled/qualcomm-qcs8550-proxy/Zipformer_ZipformerJoiner_float.onnx.zip ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e16e47f35a6969b8c6202deadea239c261370822e59b3c56f4201d2c934d6e46
3
+ size 5776036
precompiled/qualcomm-qcs8550-proxy/tool-versions.yaml ADDED
@@ -0,0 +1,4 @@
1
+ tool_versions:
2
+ precompiled_qnn_onnx:
3
+ qairt: 2.37.1.250807093845_124904
4
+ onnx_runtime: 1.23.0
precompiled/qualcomm-qcs9075-proxy/Zipformer_ZipformerDecoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7f8b0bc321754d064fd41e62589532b29288f6cbd095ba64787e66b8cfc2e9e1
3
+ size 7045120
precompiled/qualcomm-qcs9075-proxy/Zipformer_ZipformerEncoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e7d689cea00cb723269f4513baf6f3202fb5f58154d32026d227d5fe56447304
3
+ size 151408640
precompiled/qualcomm-qcs9075-proxy/Zipformer_ZipformerJoiner_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0bcdbd48d58b024ffd9bc27b8c04314d9e10de4dbb48ffbd0bd0b7fa430ca19d
3
+ size 6529024
precompiled/qualcomm-qcs9075-proxy/tool-versions.yaml ADDED
@@ -0,0 +1,3 @@
1
+ tool_versions:
2
+ qnn_context_binary:
3
+ qairt: 2.41.0.251128145156_191518-auto
precompiled/qualcomm-sa7255p/Zipformer_ZipformerDecoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5363413e257201699a5a517e5c05cfd9a5e60ce7af085ae72c52cb8a8f5a768d
3
+ size 7045120
precompiled/qualcomm-sa7255p/Zipformer_ZipformerEncoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2e7ce06d26dcbafd44e26f3521aa1ccbc0ea69ae44c5802a3ec348813d6c3a9a
3
+ size 151408640
precompiled/qualcomm-sa7255p/Zipformer_ZipformerJoiner_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9a0c7f430dfad84afe210c6ed4546bc06b5f9c3e96ee27079e33d34c19f4db42
3
+ size 6529024
precompiled/qualcomm-sa7255p/tool-versions.yaml ADDED
@@ -0,0 +1,3 @@
1
+ tool_versions:
2
+ qnn_context_binary:
3
+ qairt: 2.41.0.251128145156_191518-auto
precompiled/qualcomm-sa8295p/Zipformer_ZipformerDecoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5cf069b6d9b4a5e233d6eda478d32ff5aba8f552a15fe92caca379b0721c45e2
3
+ size 7045120
precompiled/qualcomm-sa8295p/Zipformer_ZipformerEncoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:011384fd6798c84ac99c6894fb4d8cb94337a85bd4523833925a11676b982892
3
+ size 150794240
precompiled/qualcomm-sa8295p/Zipformer_ZipformerJoiner_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:35210c56260ee29b365a13b0e1b68cf0d3bb04e98d888cc7c4778730aada4d15
3
+ size 6500352
precompiled/qualcomm-sa8295p/tool-versions.yaml ADDED
@@ -0,0 +1,3 @@
1
+ tool_versions:
2
+ qnn_context_binary:
3
+ qairt: 2.41.0.251128145156_191518-auto
precompiled/qualcomm-sa8775p/Zipformer_ZipformerDecoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7f8b0bc321754d064fd41e62589532b29288f6cbd095ba64787e66b8cfc2e9e1
3
+ size 7045120
precompiled/qualcomm-sa8775p/Zipformer_ZipformerEncoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e7d689cea00cb723269f4513baf6f3202fb5f58154d32026d227d5fe56447304
3
+ size 151408640
precompiled/qualcomm-sa8775p/Zipformer_ZipformerJoiner_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0bcdbd48d58b024ffd9bc27b8c04314d9e10de4dbb48ffbd0bd0b7fa430ca19d
3
+ size 6529024
precompiled/qualcomm-sa8775p/tool-versions.yaml ADDED
@@ -0,0 +1,3 @@
1
+ tool_versions:
2
+ qnn_context_binary:
3
+ qairt: 2.41.0.251128145156_191518-auto
precompiled/qualcomm-snapdragon-8-elite-for-galaxy/Zipformer_ZipformerDecoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:79024f5d6ea102d1b5a9dc3ef5152571e32ac0d9bc6d06dbf5bc50ac1a99f5aa
3
+ size 7045120
precompiled/qualcomm-snapdragon-8-elite-for-galaxy/Zipformer_ZipformerDecoder_float.onnx.zip ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3fb2b946ef3304467d6bf4afd95f7a5f348975d83d0080e7001c6ff9a42d7e5a
3
+ size 6501931
precompiled/qualcomm-snapdragon-8-elite-for-galaxy/Zipformer_ZipformerEncoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:14c778faddd4584bcb432c01f919e8e1f87571396fca730b15027f049888853a
3
+ size 151379968
precompiled/qualcomm-snapdragon-8-elite-for-galaxy/Zipformer_ZipformerEncoder_float.onnx.zip ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:bf1f35d67a152bddc6c0d6cb898f87088cc17c003ce4e93e622f53abc6f8265f
3
+ size 118412396
precompiled/qualcomm-snapdragon-8-elite-for-galaxy/Zipformer_ZipformerJoiner_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3733e7e5fb4623c12dfb0de06baa0a7386af67faadaa1aba251e1dc4b94e63c2
3
+ size 6529024
precompiled/qualcomm-snapdragon-8-elite-for-galaxy/Zipformer_ZipformerJoiner_float.onnx.zip ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:21519e0df17a95012d77a3eae04622bd2740bb27d8306026328976bb66ffecaa
3
+ size 5776047
precompiled/qualcomm-snapdragon-8-elite-for-galaxy/tool-versions.yaml ADDED
@@ -0,0 +1,4 @@
1
+ tool_versions:
2
+ precompiled_qnn_onnx:
3
+ qairt: 2.37.1.250807093845_124904
4
+ onnx_runtime: 1.23.0
precompiled/qualcomm-snapdragon-8-elite-gen5/Zipformer_ZipformerDecoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8583097b1d1367f0314d598a06bce244ee9243e3da845af3d842f8fa58817c59
3
+ size 7045120
precompiled/qualcomm-snapdragon-8-elite-gen5/Zipformer_ZipformerDecoder_float.onnx.zip ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:29e5a6fef628d2e46b6f634f919abfc3650fe058d9c229f06d401fe3fecf4ca0
3
+ size 6502284
precompiled/qualcomm-snapdragon-8-elite-gen5/Zipformer_ZipformerEncoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:db20f0464d17f88b7b9d05b5b519253a85e5090b725a737ac39cdb58b3a67537
3
+ size 151760896
precompiled/qualcomm-snapdragon-8-elite-gen5/Zipformer_ZipformerEncoder_float.onnx.zip ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8296211b87b5e7e37160437c3a656b3e2788c8119a09b8014c7af37f68e9f69f
3
+ size 118521474
precompiled/qualcomm-snapdragon-8-elite-gen5/Zipformer_ZipformerJoiner_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8300e75134f7634694a7dd4e89e05aad4110470acfcfdf3997f1c5cad8fa8201
3
+ size 6533120
precompiled/qualcomm-snapdragon-8-elite-gen5/Zipformer_ZipformerJoiner_float.onnx.zip ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:928d702b609926dfc586da647f9d52b97cedde9529b7f58a6ab19d5642ba3cf4
3
+ size 5777070
precompiled/qualcomm-snapdragon-8-elite-gen5/tool-versions.yaml ADDED
@@ -0,0 +1,4 @@
1
+ tool_versions:
2
+ precompiled_qnn_onnx:
3
+ qairt: 2.37.1.250807093845_124904
4
+ onnx_runtime: 1.23.0
precompiled/qualcomm-snapdragon-8gen3/Zipformer_ZipformerDecoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:df7cfc2c3a450cce4c75dc85f9e30bf09362223a64b256bd214608b1141e2ce1
3
+ size 7045120
precompiled/qualcomm-snapdragon-8gen3/Zipformer_ZipformerDecoder_float.onnx.zip ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8d58a4e0c407be5352f522e23112dfa9b392944f6ad4bf4cd22e7d4ea548e0c8
3
+ size 6501930
precompiled/qualcomm-snapdragon-8gen3/Zipformer_ZipformerEncoder_float.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:929972502ee35e5cff8b166a239203ea4dbf963ad248f97c94b602086010bb65
3
+ size 151408640