v0.33.1
See https://github.com/quic/ai-hub-models/releases/v0.33.1 for changelog.
README.md
CHANGED
@@ -11,10 +11,10 @@ pipeline_tag: text-generation
 
 
 
-#
-## Large Language Model supporting
+# ALLaM-7B: Optimized for Mobile Deployment
+## Large Language Model supporting Arabic and English
 
-
+ALLaM 7B is SDAIA's first generation edge model, optimized for performance on Snapdragon X Elite.
 
 More details on model performance across various devices, can be found [here](https://aihub.qualcomm.com/models/allam_7b).
 
@@ -35,7 +35,7 @@ Allam 7B is SDAIA's first generation edge model, optimized for performance on Sn
 
 | Model | Precision | Device | Chipset | Target Runtime | Response Rate (tokens per second) | Time To First Token (range, seconds)
 |---|---|---|---|---|---|
-| ALLaM-7B
+| ALLaM-7B | w4a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN_CONTEXT_BINARY | 9.5 | 0.23854499999999998 - 1.399168 | -- | -- |
 
 ## Deploy Allam 7B on Snapdragon X Elite NPU
 