|
|
--- |
|
|
pipeline_tag: text-generation |
|
|
tags: |
|
|
- NPU |
|
|
--- |
|
|
# Phi-3.5-Mini |
|
|
|
|
|
Run **Phi-3.5-Mini** optimized for **Qualcomm NPUs** with [nexaSDK](https://sdk.nexa.ai). |
|
|
|
|
|
## Quickstart |
|
|
|
|
|
1. **Install nexaSDK** and create a free account at [sdk.nexa.ai](https://sdk.nexa.ai) |
|
|
2. **Activate your device** with your access token: |
|
|
|
|
|
```bash |
|
|
nexa config set license '<access_token>' |
|
|
``` |
|
|
3. Run the model on Qualcomm NPU in one line: |
|
|
|
|
|
```bash |
|
|
nexa infer NexaAI/phi3.5-mini-npu |
|
|
``` |
|
|
|
|
|
|
|
|
## Model Description |
|
|
|
|
|
**Phi-3.5-Mini** is a \~3.8B-parameter instruction-tuned language model from Microsoft’s Phi family. |
|
|
It’s designed to deliver strong reasoning and instruction-following quality within a compact footprint, making it ideal for **on-device** and **latency-sensitive** applications. This Turbo build uses Nexa’s Qualcomm NPU path for faster inference and higher throughput while preserving model quality. |
|
|
|
|
|
## Features |
|
|
|
|
|
* **Lightweight yet capable**: strong performance with small memory and compute budgets. |
|
|
* **Conversational AI**: context-aware dialogue for assistants and agents. |
|
|
* **Content generation**: drafting, completion, summarization, code comments, and more. |
|
|
* **Reasoning & analysis**: math/logic step-by-step problem solving. |
|
|
* **Multilingual**: supports understanding and generation across multiple languages. |
|
|
* **Customizable**: fine-tune or apply adapters for domain-specific use. |
|
|
|
|
|
## Use Cases |
|
|
|
|
|
* Personal and enterprise chatbots |
|
|
* On-device AI applications and offline assistants |
|
|
* Document/report/email summarization |
|
|
* Education and tutoring tools |
|
|
* Vertical solutions (e.g., healthcare, finance, legal), with proper guardrails |
|
|
|
|
|
## Inputs and Outputs |
|
|
|
|
|
**Input**: |
|
|
|
|
|
* Text prompts or conversation history (tokenized input sequences). |
|
|
|
|
|
**Output**: |
|
|
|
|
|
* Generated text: responses, explanations, or creative content. |
|
|
* Optionally: raw logits/probabilities for advanced downstream tasks. |
|
|
|
|
|
## License |
|
|
|
|
|
* This model is released under the **Creative Commons Attribution–NonCommercial 4.0 (CC BY-NC 4.0)** license. |
|
|
* Non-commercial use, modification, and redistribution are permitted with attribution. |
|
|
* For commercial licensing, please contact **dev@nexa.ai**. |
|
|
|
|
|
## References |
|
|
|
|
|
* [Microsoft – Phi Models](https://www.microsoft.com/en-us/research/project/phi-3) |
|
|
* [Hugging Face Model Card (Phi-3.5-Mini-Instruct)](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) |
|
|
* [Phi-3 Technical Report (blog/overview)](https://azure.microsoft.com/en-us/blog/introducing-phi-3) |