nexaml commited on
Commit
f8ec247
·
verified ·
1 Parent(s): c2c6c4d

Create README.md

Browse files
Files changed (1) hide show
  1. README.md +52 -0
README.md ADDED
@@ -0,0 +1,52 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ pipeline_tag: text-generation
3
+ tags:
4
+ - NPU
5
+ ---
6
+ # Phi-4-mini
7
+
8
+ Run **Phi-4-mini** optimized for **Qualcomm NPUs** with [nexaSDK](https://sdk.nexa.ai).
9
+
10
+ ## Model Description
11
+
12
+ **Phi-4-mini** is a \~3.8B-parameter instruction-tuned model from Microsoft’s Phi-4 family.
13
+ Trained on a blend of synthetic “textbook-style” data, filtered public web content, curated books/Q\&A, and high-quality supervised chat data, it emphasizes **reasoning-dense** capabilities while maintaining a compact footprint. This NPU **Turbo** build uses Nexa’s Qualcomm backend (QNN/Hexagon) to deliver **lower latency** and **higher throughput** on-device, with support for **128K context** and efficient long-context memory handling.
14
+
15
+ ## Features
16
+
17
+ * **Lightweight yet capable**: strong reasoning (math/logic) in a compact 3.8B model.
18
+ * **Instruction-following**: enhanced SFT + DPO alignment for reliable chat.
19
+ * **Content generation**: drafting, completion, summarization, code comments, and more.
20
+ * **Conversational AI**: context-aware assistants/agents with long-context support (128K).
21
+ * **NPU-Turbo path**: INT8/INT4 quantization, op fusion, and KV-cache residency for Snapdragon® NPUs via nexaSDK.
22
+ * **Customizable**: fine-tune/adapt for domain-specific or enterprise use.
23
+
24
+ ## Use Cases
25
+
26
+ * Personal & enterprise chatbots
27
+ * On-device/offline assistants (latency-bound scenarios)
28
+ * Document/report/email summarization
29
+ * Education, tutoring, and STEM reasoning tools
30
+ * Vertical applications (e.g., healthcare, finance, legal) with appropriate safeguards
31
+
32
+ ## Inputs and Outputs
33
+
34
+ **Input**:
35
+
36
+ * Text prompts or conversation history (chat-format, tokenized sequences).
37
+
38
+ **Output**:
39
+
40
+ * Generated text: responses, explanations, or creative content.
41
+ * Optionally: raw logits/probabilities for advanced downstream tasks.
42
+
43
+ ## License
44
+ This model is released under the **Creative Commons Attribution–NonCommercial 4.0 (CC BY-NC 4.0)** license.
45
+ Non-commercial use, modification, and redistribution are permitted with attribution.
46
+ For commercial licensing, please contact **dev@nexa.ai**.
47
+
48
+ ## References
49
+ 📰 [Phi-4-mini Microsoft Blog](https://aka.ms/phi4-feb2025) <br>
50
+ 📖 [Phi-4-mini Technical Report](https://aka.ms/phi-4-multimodal/techreport) <br>
51
+ 👩‍🍳 [Phi Cookbook](https://github.com/microsoft/PhiCookBook) <br>
52
+ 🚀 [Model paper](https://huggingface.co/papers/2503.01743)