TurkishCodeMan committed on
Commit c910921 · verified · 1 Parent(s): 8d0786d

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +69 -0
README.md ADDED
@@ -0,0 +1,69 @@
+ ---
+ license: apache-2.0
+ tags:
+ - gguf
+ - llama.cpp
+ - ios
+ - mobile
+ - qwen2.5
+ - quantized
+ base_model: Qwen/Qwen2.5-3B-Instruct
+ model_type: qwen2
+ language:
+ - en
+ - tr
+ ---
+
+ # Qwen2.5-3B-Instruct-Q4_K_M-GGUF
+
+ A GGUF build of Qwen2.5-3B-Instruct, quantized to Q4_K_M for mobile and edge deployment.
+
+ ## Model Details
+
+ - **Base Model:** [Qwen/Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct)
+ - **Quantization:** Q4_K_M (4-bit quantization with K-quants)
+ - **File Size:** ~1.8 GB
+ - **Format:** GGUF
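
As a rough sanity check on the file size listed above, the size of a quantized model can be approximated as parameters × bits per weight ÷ 8. The sketch below assumes about 3.09 billion parameters for the base model and an average of about 4.85 bits per weight for Q4_K_M. Both figures are approximations supplied for illustration, not values taken from this repository, and the helper name `estimate_gguf_size_gb` is hypothetical.

```bash
# Hypothetical helper: estimate a quantized file size in GB (10^9 bytes).
# $1 = parameter count (assumed), $2 = average bits per weight (assumed).
estimate_gguf_size_gb() {
  awk -v params="$1" -v bpw="$2" \
    'BEGIN { printf "%.2f\n", params * bpw / 8 / 1e9 }'
}

# Assumed inputs: ~3.09e9 parameters, ~4.85 bits per weight for Q4_K_M.
estimate_gguf_size_gb 3.09e9 4.85
```

Under these assumptions the estimate lands close to the ~1.8 GB listed above. The real file also stores metadata and keeps some tensors at higher precision, so the number is only a ballpark.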
+
+ ## Usage
+
+ ### With llama.cpp
+
+ ```bash
+ ./llama-cli -m Qwen2.5-3B-Instruct-Q4_K_M.gguf -p "Hello, how are you?"
+ ```
+
+ ### With llama.swiftui (iOS)
+
+ At roughly 1.8 GB, this file is small enough to run on recent iOS devices with the llama.swiftui example app from llama.cpp.
+
+ 1. Download `Qwen2.5-3B-Instruct-Q4_K_M.gguf`.
+ 2. Copy it to the app's Documents folder.
+ 3. Load the model in the app and start chatting.
+
+ ### Chat Template
+
+ ```
+ <|im_start|>system
+ You are a helpful assistant.<|im_end|>
+ <|im_start|>user
+ {user_message}<|im_end|>
+ <|im_start|>assistant
+ ```
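
To make the template concrete, here is a minimal shell sketch that fills it in for a single user turn. The function name `build_chatml_prompt` and its argument order are illustrative assumptions; they are not part of llama.cpp or the model.

```bash
# Hypothetical helper: fill the chat template shown above.
# $1 = user message
# $2 = optional system prompt (defaults to the one in the template)
build_chatml_prompt() {
  printf '<|im_start|>system\n%s<|im_end|>\n<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n' \
    "${2:-You are a helpful assistant.}" "$1"
}

# Print the filled template for a sample message.
build_chatml_prompt "Hello, how are you?"
```

The output can be passed to the `-p` flag from the llama.cpp example, for instance `-p "$(build_chatml_prompt "Hello, how are you?")"`. Recent llama.cpp builds may apply the chat template stored in the GGUF metadata on their own when run in conversation mode, in which case manual formatting is likely unnecessary.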
+
+ ## Performance
+
+ | Device | Throughput (tokens/sec, approx.) |
+ |--------|----------------------------------|
+ | iPhone 15 Pro | 15-25 |
+ | iPhone 14 | 10-15 |
+ | M1 Mac | 30-50 |
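
To relate the throughput figures to waiting time, generation time is roughly the number of output tokens divided by tokens per second. The sketch below applies that to the iPhone 15 Pro range from the table; the 300-token reply length and the helper name `generation_seconds` are assumptions chosen for illustration, and prompt processing time is ignored.

```bash
# Hypothetical helper: seconds needed to generate a reply.
# $1 = number of output tokens, $2 = throughput in tokens/sec.
generation_seconds() {
  awk -v tokens="$1" -v tps="$2" 'BEGIN { printf "%.1f\n", tokens / tps }'
}

# A 300-token reply at the low and high end of the iPhone 15 Pro range.
generation_seconds 300 15
generation_seconds 300 25
```

Swapping in the other rows of the table gives comparable estimates for the iPhone 14 and the M1 Mac.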
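+
+ ## License
+
+ This repository is tagged Apache 2.0. The base model, Qwen2.5-3B-Instruct, is published under the Qwen Research License rather than Apache 2.0, so the base model's terms apply to these quantized weights.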
+
+ ## Credits
+
+ - Original model by [Qwen Team](https://huggingface.co/Qwen)
+ - Quantization and mobile optimization by TurkishCodeMan