qaihm-bot committed on
Commit 72f4787 · verified · 1 Parent(s): de88828

See https://github.com/quic/ai-hub-models/releases/v0.34.0 for changelog.

README.md CHANGED
@@ -23,6 +23,7 @@ More details on model performance across various devices, can be found
  [here](https://aihub.qualcomm.com/models/nomic_embed_text).
 
 
+
  ### Model Details
 
  - **Model Type:** Model_use_case.text_generation
@@ -34,31 +35,31 @@ More details on model performance across various devices, can be found
 
  | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model
  |---|---|---|---|---|---|---|---|---|
- | Nomic-Embed-Text | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | TFLITE | 34.133 ms | 0 - 424 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
+ | Nomic-Embed-Text | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | TFLITE | 186.656 ms | 0 - 424 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
  | Nomic-Embed-Text | float | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN_DLC | 28.718 ms | 0 - 367 MB | NPU | [Nomic-Embed-Text.dlc](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.dlc) |
- | Nomic-Embed-Text | float | QCS8450 (Proxy) | Qualcomm® QCS8450 (Proxy) | TFLITE | 12.109 ms | 0 - 409 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
+ | Nomic-Embed-Text | float | QCS8450 (Proxy) | Qualcomm® QCS8450 (Proxy) | TFLITE | 12.239 ms | 0 - 410 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
  | Nomic-Embed-Text | float | QCS8450 (Proxy) | Qualcomm® QCS8450 (Proxy) | QNN_DLC | 10.755 ms | 0 - 374 MB | NPU | [Nomic-Embed-Text.dlc](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.dlc) |
- | Nomic-Embed-Text | float | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | TFLITE | 10.288 ms | 0 - 29 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
+ | Nomic-Embed-Text | float | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | TFLITE | 9.849 ms | 3 - 34 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
  | Nomic-Embed-Text | float | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_DLC | 7.625 ms | 0 - 28 MB | NPU | [Nomic-Embed-Text.dlc](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.dlc) |
- | Nomic-Embed-Text | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | TFLITE | 12.437 ms | 0 - 424 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
+ | Nomic-Embed-Text | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | TFLITE | 12.405 ms | 0 - 424 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
  | Nomic-Embed-Text | float | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN_DLC | 9.863 ms | 0 - 367 MB | NPU | [Nomic-Embed-Text.dlc](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.dlc) |
- | Nomic-Embed-Text | float | SA7255P ADP | Qualcomm® SA7255P | TFLITE | 34.133 ms | 0 - 424 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
+ | Nomic-Embed-Text | float | SA7255P ADP | Qualcomm® SA7255P | TFLITE | 186.656 ms | 0 - 424 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
  | Nomic-Embed-Text | float | SA7255P ADP | Qualcomm® SA7255P | QNN_DLC | 28.718 ms | 0 - 367 MB | NPU | [Nomic-Embed-Text.dlc](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.dlc) |
- | Nomic-Embed-Text | float | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | TFLITE | 10.335 ms | 0 - 31 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
+ | Nomic-Embed-Text | float | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | TFLITE | 10.253 ms | 0 - 24 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
  | Nomic-Embed-Text | float | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN_DLC | 7.667 ms | 0 - 30 MB | NPU | [Nomic-Embed-Text.dlc](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.dlc) |
- | Nomic-Embed-Text | float | SA8295P ADP | Qualcomm® SA8295P | TFLITE | 13.945 ms | 0 - 397 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
+ | Nomic-Embed-Text | float | SA8295P ADP | Qualcomm® SA8295P | TFLITE | 14.01 ms | 0 - 397 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
  | Nomic-Embed-Text | float | SA8295P ADP | Qualcomm® SA8295P | QNN_DLC | 10.791 ms | 0 - 359 MB | NPU | [Nomic-Embed-Text.dlc](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.dlc) |
- | Nomic-Embed-Text | float | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | TFLITE | 10.39 ms | 0 - 30 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
+ | Nomic-Embed-Text | float | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | TFLITE | 10.225 ms | 0 - 20 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
  | Nomic-Embed-Text | float | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN_DLC | 7.745 ms | 0 - 26 MB | NPU | [Nomic-Embed-Text.dlc](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.dlc) |
- | Nomic-Embed-Text | float | SA8775P ADP | Qualcomm® SA8775P | TFLITE | 12.437 ms | 0 - 424 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
+ | Nomic-Embed-Text | float | SA8775P ADP | Qualcomm® SA8775P | TFLITE | 12.405 ms | 0 - 424 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
  | Nomic-Embed-Text | float | SA8775P ADP | Qualcomm® SA8775P | QNN_DLC | 9.863 ms | 0 - 367 MB | NPU | [Nomic-Embed-Text.dlc](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.dlc) |
- | Nomic-Embed-Text | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | TFLITE | 10.265 ms | 0 - 24 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
+ | Nomic-Embed-Text | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | TFLITE | 10.36 ms | 0 - 29 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
  | Nomic-Embed-Text | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_DLC | 7.65 ms | 0 - 25 MB | NPU | [Nomic-Embed-Text.dlc](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.dlc) |
  | Nomic-Embed-Text | float | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 8.007 ms | 0 - 22 MB | NPU | [Nomic-Embed-Text.onnx](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.onnx) |
- | Nomic-Embed-Text | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | TFLITE | 7.335 ms | 0 - 434 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
+ | Nomic-Embed-Text | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | TFLITE | 7.333 ms | 0 - 433 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
  | Nomic-Embed-Text | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_DLC | 5.416 ms | 0 - 374 MB | NPU | [Nomic-Embed-Text.dlc](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.dlc) |
  | Nomic-Embed-Text | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 5.547 ms | 0 - 377 MB | NPU | [Nomic-Embed-Text.onnx](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.onnx) |
- | Nomic-Embed-Text | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | TFLITE | 6.118 ms | 0 - 420 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
+ | Nomic-Embed-Text | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | TFLITE | 7.39 ms | 0 - 417 MB | NPU | [Nomic-Embed-Text.tflite](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.tflite) |
  | Nomic-Embed-Text | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN_DLC | 5.189 ms | 0 - 367 MB | NPU | [Nomic-Embed-Text.dlc](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.dlc) |
  | Nomic-Embed-Text | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 5.472 ms | 0 - 369 MB | NPU | [Nomic-Embed-Text.onnx](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.onnx) |
  | Nomic-Embed-Text | float | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_DLC | 9.347 ms | 1580 - 1580 MB | NPU | [Nomic-Embed-Text.dlc](https://huggingface.co/qualcomm/Nomic-Embed-Text/blob/main/Nomic-Embed-Text.dlc) |
@@ -120,17 +121,7 @@ device. This script does the following:
  ```bash
  python -m qai_hub_models.models.nomic_embed_text.export
  ```
- ```
- Profiling Results
- ------------------------------------------------------------
- Nomic-Embed-Text
- Device : cs_8275 (ANDROID 14)
- Runtime : TFLITE
- Estimated inference time (ms) : 34.1
- Estimated peak memory usage (MB): [0, 424]
- Total # Ops : 906
- Compute Unit(s) : npu (906 ops) gpu (0 ops) cpu (0 ops)
- ```
+
 
 
  ## How does this work?
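
Most TFLITE rows in the README table move by about a percent, but a few move a lot. As a rough aid for reviewing this diff, the sketch below recomputes the relative change per device from the TFLITE figures in the removed and added rows. The 10% cutoff and the helper names (`percent_change`, `flag_large_changes`) are assumptions made for illustration; they are not part of the commit or of `qai_hub_models`.

```python
# TFLITE inference times in ms, copied from the removed (-) and added (+)
# README table rows in this diff: device -> (before, after).
TFLITE_MS = {
    "QCS8275 (Proxy)": (34.133, 186.656),
    "QCS8450 (Proxy)": (12.109, 12.239),
    "QCS8550 (Proxy)": (10.288, 9.849),
    "QCS9075 (Proxy)": (12.437, 12.405),
    "SA7255P ADP": (34.133, 186.656),
    "SA8255 (Proxy)": (10.335, 10.253),
    "SA8295P ADP": (13.945, 14.01),
    "SA8650 (Proxy)": (10.39, 10.225),
    "SA8775P ADP": (12.437, 12.405),
    "Samsung Galaxy S23": (10.265, 10.36),
    "Samsung Galaxy S24": (7.335, 7.333),
    "Snapdragon 8 Elite QRD": (6.118, 7.39),
}


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def flag_large_changes(times: dict, threshold_pct: float = 10.0) -> dict:
    """Return devices whose latency moved by at least `threshold_pct`.

    The default threshold is an arbitrary, hypothetical cutoff.
    """
    flagged = {}
    for device, (before, after) in times.items():
        change = percent_change(before, after)
        if abs(change) >= threshold_pct:
            flagged[device] = change
    return flagged


if __name__ == "__main__":
    for device, change in flag_large_changes(TFLITE_MS).items():
        print(f"{device}: {change:+.1f}%")
```

With this cutoff, only QCS8275 (Proxy), SA7255P ADP, and Snapdragon 8 Elite QRD are flagged; the diff itself gives no reason for those changes.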
precompiled/qualcomm-snapdragon-x-elite/Nomic-Embed-Text.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ff6eee01a7aec9a14bb0fbd0c82ea09e32fb5e968879cba2fccd625ce91e6078
+ oid sha256:d99b37fff9a34d0a09cb6f5d658af97aa66c63c31b27c933e1c6ad6882a998c7
  size 274790144
precompiled/qualcomm-snapdragon-x-elite/Nomic-Embed-Text.onnx.zip CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:cb90905f5f53adb953efbc3b514ac6cd39b1849a217e11b4db23f217e64b95ce
- size 253746367
+ oid sha256:f31020c935ca654f043d8d06907d09154e21659a27ec7f00617f22bbf47495a2
+ size 253746202
precompiled/qualcomm-snapdragon-x-elite/sdk_versions.yml ADDED
@@ -0,0 +1,5 @@
+ sdk_versions:
+   qnn_context_binary:
+     qairt: 2.34.2.250528164111_119506
+   precompiled_qnn_onnx:
+     qairt: 2.33.2.250410134701_117956
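
The new `sdk_versions.yml` records a different QAIRT version for each precompiled artifact. A minimal sketch for comparing them follows, with the two version strings copied from the file above. It assumes that the first three dot-separated fields are numeric release components; that reading of the string and the helper names (`release_triplet`, `artifacts_share_release`) are hypothetical, not a documented QAIRT format or API.

```python
# QAIRT version strings copied from sdk_versions.yml in this commit.
QAIRT_VERSIONS = {
    "qnn_context_binary": "2.34.2.250528164111_119506",
    "precompiled_qnn_onnx": "2.33.2.250410134701_117956",
}


def release_triplet(qairt: str) -> tuple:
    """Return the first three dot-separated fields as integers.

    Assumption: these fields are the numeric release components; the
    remainder of the string is ignored.
    """
    major, minor, patch = qairt.split(".")[:3]
    return int(major), int(minor), int(patch)


def artifacts_share_release(versions: dict) -> bool:
    """True if every artifact reports the same release triplet."""
    return len({release_triplet(v) for v in versions.values()}) == 1


if __name__ == "__main__":
    for artifact, version in QAIRT_VERSIONS.items():
        print(artifact, release_triplet(version))
    print("same release:", artifacts_share_release(QAIRT_VERSIONS))
```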