v0.48.0
See https://github.com/qualcomm/ai-hub-models/releases/v0.48.0 for changelog.
README.md
CHANGED
@@ -16,7 +16,7 @@ pipeline_tag: text-generation
 Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B.
 
 This is based on the implementation of Falcon3-7B-Instruct found [here](https://huggingface.co/tiiuae/Falcon3-7B-Instruct).
-This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the [Qualcomm® AI Hub Models](https://github.com/
+This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the [Qualcomm® AI Hub Models](https://github.com/qualcomm/ai-hub-models/blob/main/qai_hub_models/models/falcon_v3_7b_instruct) library to export with custom configurations. More details on model performance across various devices, can be found [here](#performance-summary).
 
 Qualcomm AI Hub Models uses [Qualcomm AI Hub Workbench](https://workbench.aihub.qualcomm.com) to compile, profile, and evaluate this model. [Sign up](https://myaccount.qualcomm.com/signup) to run these models on a hosted Qualcomm® device.
 
@@ -34,14 +34,14 @@ Download pre-exported model assets from **[Falcon3-7B-Instruct on Qualcomm® AI
 
 ### Option 2: Export with Custom Configurations
 
-Use the [Qualcomm® AI Hub Models](https://github.com/
+Use the [Qualcomm® AI Hub Models](https://github.com/qualcomm/ai-hub-models/blob/main/qai_hub_models/models/falcon_v3_7b_instruct) Python library to compile and export the model with your own:
 - Custom weights (e.g., fine-tuned checkpoints)
 - Custom input shapes
 - Target device and runtime configurations
 
 This option is ideal if you need to customize the model beyond the default configuration provided here.
 
-See our repository for [Falcon3-7B-Instruct on GitHub](https://github.com/
+See our repository for [Falcon3-7B-Instruct on GitHub](https://github.com/qualcomm/ai-hub-models/blob/main/qai_hub_models/models/falcon_v3_7b_instruct) for usage instructions.
 
 ## Model Details
 
@@ -59,8 +59,10 @@ See our repository for [Falcon3-7B-Instruct on GitHub](https://github.com/quic/a
 | Model | Runtime | Precision | Chipset | Context Length | Response Rate (tokens per second) | Time To First Token (range, seconds)
 |---|---|---|---|---|---|---
 | Falcon3-7B-Instruct | GENIE | w4a16 | Snapdragon® 8 Elite Mobile | 4096 | 14.02985 | 0.1265205 - 4.048656
+| Falcon3-7B-Instruct | GENIE | w4a16 | Snapdragon® X2 Elite | 4096 | 22.628025817871094 | 0.1165148 - 3.7284736
 | Falcon3-7B-Instruct | GENIE | w4a16 | Snapdragon® X Elite | 4096 | 9.96829 | 0.1973798 - 6.3161536
 | Falcon3-7B-Instruct | GENIE | w4a16 | Snapdragon® 8 Elite Gen 5 Mobile | 4096 | 15.8303 | 0.10903 - 3.488966
+| Falcon3-7B-Instruct | GENIE | w4a16 | Qualcomm® QCS9075 | 4096 | 10.098000383377075 | 0.1680149 - 5.3764768
 
 ## License
 * The license for the original implementation of Falcon3-7B-Instruct can be found