qaihm-bot committed on
Commit a26deec · verified · 1 Parent(s): 56fdbf4

See https://github.com/qualcomm/ai-hub-models/releases/v0.49.1 for changelog.
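
Most of the diff below is a version bump from v0.48.0 to v0.49.1 in the asset download links. As a minimal sketch, the URL pattern those links appear to follow can be written as a small helper. The function name `asset_url` and its parameters are hypothetical, and the pattern is inferred only from the three links in this README, so it may not hold for other models or releases.

```python
# Hypothetical helper: rebuilds the download links seen in the README's
# "Download" column. The pattern is inferred from the diff, not from any
# documented API.
BASE = "https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models"


def asset_url(model_id: str, version: str, runtime: str, precision: str = "float") -> str:
    """Return the asset URL for a model, release version, runtime, and precision."""
    return f"{BASE}/{model_id}/releases/v{version}/{model_id}-{runtime}-{precision}.zip"


# The runtime appears in lowercase in the URL (e.g. QNN_DLC -> qnn_dlc).
old_url = asset_url("nomic_embed_text", "0.48.0", "onnx")
new_url = asset_url("nomic_embed_text", "0.49.1", "onnx")
print(old_url)
print(new_url)
```

Only the version segment differs between the two printed URLs, which matches the `-`/`+` pairs in the download table of this commit.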

Files changed (1)
  1. README.md +35 -35
README.md CHANGED
@@ -14,7 +14,7 @@ pipeline_tag: text-generation
14
  A text encoder that surpasses OpenAI text-embedding-ada-002 and text-embedding-3-small performance on short and long context tasks.
15
 
16
  This is based on the implementation of Nomic-Embed-Text found [here](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5).
17
- This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the [Qualcomm® AI Hub Models](https://github.com/qualcomm/ai-hub-models/blob/main/qai_hub_models/models/nomic_embed_text) library to export with custom configurations. More details on model performance across various devices, can be found [here](#performance-summary).
18
 
19
  Qualcomm AI Hub Models uses [Qualcomm AI Hub Workbench](https://workbench.aihub.qualcomm.com) to compile, profile, and evaluate this model. [Sign up](https://myaccount.qualcomm.com/signup) to run these models on a hosted Qualcomm® device.
20
 
@@ -27,23 +27,23 @@ Below are pre-exported model assets ready for deployment.
27
 
28
  | Runtime | Precision | Chipset | SDK Versions | Download |
29
  |---|---|---|---|---|
30
- | ONNX | float | Universal | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/nomic_embed_text/releases/v0.48.0/nomic_embed_text-onnx-float.zip)
31
- | QNN_DLC | float | Universal | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/nomic_embed_text/releases/v0.48.0/nomic_embed_text-qnn_dlc-float.zip)
32
- | TFLITE | float | Universal | QAIRT 2.43, TFLite 2.17.0 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/nomic_embed_text/releases/v0.48.0/nomic_embed_text-tflite-float.zip)
33
 
34
  For more device-specific assets and performance metrics, visit **[Nomic-Embed-Text on Qualcomm® AI Hub](https://aihub.qualcomm.com/models/nomic_embed_text)**.
35
 
36
 
37
  ### Option 2: Export with Custom Configurations
38
 
39
- Use the [Qualcomm® AI Hub Models](https://github.com/qualcomm/ai-hub-models/blob/main/qai_hub_models/models/nomic_embed_text) Python library to compile and export the model with your own:
40
  - Custom weights (e.g., fine-tuned checkpoints)
41
  - Custom input shapes
42
  - Target device and runtime configurations
43
 
44
  This option is ideal if you need to customize the model beyond the default configuration provided here.
45
 
46
- See our repository for [Nomic-Embed-Text on GitHub](https://github.com/qualcomm/ai-hub-models/blob/main/qai_hub_models/models/nomic_embed_text) for usage instructions.
47
 
48
  ## Model Details
49
 
@@ -58,35 +58,35 @@ See our repository for [Nomic-Embed-Text on GitHub](https://github.com/qualcomm/
58
  ## Performance Summary
59
  | Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit
60
  |---|---|---|---|---|---|---
61
- | Nomic-Embed-Text | ONNX | float | Snapdragon® X2 Elite | 3.572 ms | 263 - 263 MB | NPU
62
- | Nomic-Embed-Text | ONNX | float | Snapdragon® X Elite | 8.513 ms | 263 - 263 MB | NPU
63
- | Nomic-Embed-Text | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 5.576 ms | 0 - 451 MB | NPU
64
- | Nomic-Embed-Text | ONNX | float | Qualcomm® QCS8550 (Proxy) | 7.941 ms | 0 - 324 MB | NPU
65
- | Nomic-Embed-Text | ONNX | float | Qualcomm® QCS9075 | 10.976 ms | 0 - 3 MB | NPU
66
- | Nomic-Embed-Text | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 4.339 ms | 0 - 413 MB | NPU
67
- | Nomic-Embed-Text | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 3.881 ms | 0 - 416 MB | NPU
68
- | Nomic-Embed-Text | QNN_DLC | float | Snapdragon® X2 Elite | 3.895 ms | 1 - 1 MB | NPU
69
- | Nomic-Embed-Text | QNN_DLC | float | Snapdragon® X Elite | 8.031 ms | 0 - 0 MB | NPU
70
- | Nomic-Embed-Text | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 5.304 ms | 0 - 444 MB | NPU
71
- | Nomic-Embed-Text | QNN_DLC | float | Qualcomm® QCS8275 (Proxy) | 28.352 ms | 0 - 415 MB | NPU
72
- | Nomic-Embed-Text | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 7.455 ms | 0 - 2 MB | NPU
73
- | Nomic-Embed-Text | QNN_DLC | float | Qualcomm® SA8775P | 9.687 ms | 0 - 416 MB | NPU
74
- | Nomic-Embed-Text | QNN_DLC | float | Qualcomm® QCS9075 | 10.47 ms | 0 - 2 MB | NPU
75
- | Nomic-Embed-Text | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 10.966 ms | 0 - 428 MB | NPU
76
- | Nomic-Embed-Text | QNN_DLC | float | Qualcomm® SA7255P | 28.352 ms | 0 - 415 MB | NPU
77
- | Nomic-Embed-Text | QNN_DLC | float | Qualcomm® SA8295P | 10.6 ms | 0 - 396 MB | NPU
78
- | Nomic-Embed-Text | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 4.419 ms | 0 - 414 MB | NPU
79
- | Nomic-Embed-Text | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 3.904 ms | 0 - 412 MB | NPU
80
- | Nomic-Embed-Text | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 5.303 ms | 0 - 449 MB | NPU
81
- | Nomic-Embed-Text | TFLITE | float | Qualcomm® QCS8275 (Proxy) | 28.326 ms | 0 - 419 MB | NPU
82
- | Nomic-Embed-Text | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 7.47 ms | 0 - 3 MB | NPU
83
- | Nomic-Embed-Text | TFLITE | float | Qualcomm® SA8775P | 9.697 ms | 0 - 419 MB | NPU
84
- | Nomic-Embed-Text | TFLITE | float | Qualcomm® QCS9075 | 10.61 ms | 0 - 265 MB | NPU
85
- | Nomic-Embed-Text | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 10.912 ms | 0 - 429 MB | NPU
86
- | Nomic-Embed-Text | TFLITE | float | Qualcomm® SA7255P | 28.326 ms | 0 - 419 MB | NPU
87
- | Nomic-Embed-Text | TFLITE | float | Qualcomm® SA8295P | 10.633 ms | 0 - 398 MB | NPU
88
- | Nomic-Embed-Text | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 4.336 ms | 0 - 422 MB | NPU
89
- | Nomic-Embed-Text | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 3.879 ms | 0 - 418 MB | NPU
90
 
91
  ## License
92
  * The license for the original implementation of Nomic-Embed-Text can be found
 
14
  A text encoder that surpasses OpenAI text-embedding-ada-002 and text-embedding-3-small performance on short and long context tasks.
15
 
16
  This is based on the implementation of Nomic-Embed-Text found [here](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5).
17
+ This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the [Qualcomm® AI Hub Models](https://github.com/qualcomm/ai-hub-models/tree/v0.49.1/qai_hub_models/models/nomic_embed_text) library to export with custom configurations. More details on model performance across various devices, can be found [here](#performance-summary).
18
 
19
  Qualcomm AI Hub Models uses [Qualcomm AI Hub Workbench](https://workbench.aihub.qualcomm.com) to compile, profile, and evaluate this model. [Sign up](https://myaccount.qualcomm.com/signup) to run these models on a hosted Qualcomm® device.
20
 
 
27
 
28
  | Runtime | Precision | Chipset | SDK Versions | Download |
29
  |---|---|---|---|---|
30
+ | ONNX | float | Universal | QAIRT 2.42, ONNX Runtime 1.24.1 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/nomic_embed_text/releases/v0.49.1/nomic_embed_text-onnx-float.zip)
31
+ | QNN_DLC | float | Universal | QAIRT 2.43 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/nomic_embed_text/releases/v0.49.1/nomic_embed_text-qnn_dlc-float.zip)
32
+ | TFLITE | float | Universal | QAIRT 2.43, TFLite 2.17.0 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/nomic_embed_text/releases/v0.49.1/nomic_embed_text-tflite-float.zip)
33
 
34
  For more device-specific assets and performance metrics, visit **[Nomic-Embed-Text on Qualcomm® AI Hub](https://aihub.qualcomm.com/models/nomic_embed_text)**.
35
 
36
 
37
  ### Option 2: Export with Custom Configurations
38
 
39
+ Use the [Qualcomm® AI Hub Models](https://github.com/qualcomm/ai-hub-models/tree/v0.49.1/qai_hub_models/models/nomic_embed_text) Python library to compile and export the model with your own:
40
  - Custom weights (e.g., fine-tuned checkpoints)
41
  - Custom input shapes
42
  - Target device and runtime configurations
43
 
44
  This option is ideal if you need to customize the model beyond the default configuration provided here.
45
 
46
+ See our repository for [Nomic-Embed-Text on GitHub](https://github.com/qualcomm/ai-hub-models/tree/v0.49.1/qai_hub_models/models/nomic_embed_text) for usage instructions.
47
 
48
  ## Model Details
49
 
 
58
  ## Performance Summary
59
  | Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit
60
  |---|---|---|---|---|---|---
61
+ | Nomic-Embed-Text | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 3.875 ms | 0 - 415 MB | NPU
62
+ | Nomic-Embed-Text | ONNX | float | Snapdragon® X2 Elite | 3.571 ms | 263 - 263 MB | NPU
63
+ | Nomic-Embed-Text | ONNX | float | Snapdragon® X Elite | 8.5 ms | 263 - 263 MB | NPU
64
+ | Nomic-Embed-Text | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 5.569 ms | 0 - 462 MB | NPU
65
+ | Nomic-Embed-Text | ONNX | float | Qualcomm® QCS8550 (Proxy) | 7.948 ms | 0 - 324 MB | NPU
66
+ | Nomic-Embed-Text | ONNX | float | Qualcomm® QCS9075 | 10.923 ms | 0 - 3 MB | NPU
67
+ | Nomic-Embed-Text | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 4.335 ms | 0 - 413 MB | NPU
68
+ | Nomic-Embed-Text | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 3.906 ms | 0 - 413 MB | NPU
69
+ | Nomic-Embed-Text | QNN_DLC | float | Snapdragon® X2 Elite | 3.829 ms | 0 - 0 MB | NPU
70
+ | Nomic-Embed-Text | QNN_DLC | float | Snapdragon® X Elite | 8.004 ms | 0 - 0 MB | NPU
71
+ | Nomic-Embed-Text | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 5.331 ms | 0 - 446 MB | NPU
72
+ | Nomic-Embed-Text | QNN_DLC | float | Qualcomm® QCS8275 (Proxy) | 28.373 ms | 0 - 415 MB | NPU
73
+ | Nomic-Embed-Text | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 7.456 ms | 0 - 2 MB | NPU
74
+ | Nomic-Embed-Text | QNN_DLC | float | Qualcomm® SA8775P | 9.711 ms | 0 - 415 MB | NPU
75
+ | Nomic-Embed-Text | QNN_DLC | float | Qualcomm® QCS9075 | 10.473 ms | 0 - 2 MB | NPU
76
+ | Nomic-Embed-Text | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 11.48 ms | 0 - 430 MB | NPU
77
+ | Nomic-Embed-Text | QNN_DLC | float | Qualcomm® SA7255P | 28.373 ms | 0 - 415 MB | NPU
78
+ | Nomic-Embed-Text | QNN_DLC | float | Qualcomm® SA8295P | 10.611 ms | 0 - 396 MB | NPU
79
+ | Nomic-Embed-Text | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 4.29 ms | 0 - 413 MB | NPU
80
+ | Nomic-Embed-Text | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 3.874 ms | 0 - 419 MB | NPU
81
+ | Nomic-Embed-Text | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 5.313 ms | 0 - 453 MB | NPU
82
+ | Nomic-Embed-Text | TFLITE | float | Qualcomm® QCS8275 (Proxy) | 28.33 ms | 0 - 419 MB | NPU
83
+ | Nomic-Embed-Text | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 7.332 ms | 0 - 3 MB | NPU
84
+ | Nomic-Embed-Text | TFLITE | float | Qualcomm® SA8775P | 9.682 ms | 0 - 419 MB | NPU
85
+ | Nomic-Embed-Text | TFLITE | float | Qualcomm® QCS9075 | 10.634 ms | 0 - 265 MB | NPU
86
+ | Nomic-Embed-Text | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 11.008 ms | 0 - 430 MB | NPU
87
+ | Nomic-Embed-Text | TFLITE | float | Qualcomm® SA7255P | 28.33 ms | 0 - 419 MB | NPU
88
+ | Nomic-Embed-Text | TFLITE | float | Qualcomm® SA8295P | 10.642 ms | 0 - 397 MB | NPU
89
+ | Nomic-Embed-Text | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 4.379 ms | 0 - 422 MB | NPU
90
 
91
  ## License
92
  * The license for the original implementation of Nomic-Embed-Text can be found