Irfanuruchi committed (verified)
Commit 16cfc7f · 1 Parent(s): 6157987

Update README.md

Files changed (1): README.md +34 -0
README.md CHANGED
@@ -32,9 +32,43 @@ tags:
  - nlp
  - code
  - mlx
+ - apple-silicon
+ - on-device
+ - phi
+ - local-llm
+ - quantized
  widget:
  - messages:
    - role: user
      content: Can you provide ways to eat combinations of bananas and dragonfruits?
  base_model: microsoft/Phi-4-mini-instruct
  ---
+
+ # Phi-4-mini-instruct (MLX 4-bit)
+
+ This is a **4-bit MLX-quantized** version of `microsoft/Phi-4-mini-instruct`, optimized for **Apple Silicon** and **local, on-device inference**.
+
+ ## Benchmark Environment
+ - Device: MacBook Pro (M3 Pro)
+ - Runtime: MLX
+ - Precision: 4-bit (~4.5 bits per weight)
+
+ ## Performance (Measured)
+ - Disk size: ~2.0 GB
+ - Peak memory: ~2.24 GB
+ - Generation speed: ~56 tokens/sec
+
+ > Benchmarks were collected on macOS (M3 Pro).
+ > iPhone / iPad performance will vary with the device's hardware and available memory.
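As a rough sanity check, the reported disk size is consistent with the quoted ~4.5 effective bits per weight. The sketch below assumes a parameter count of roughly 3.8 B for Phi-4-mini-instruct (taken as an assumption from the upstream model card, not measured from this repo):

```python
# Sanity check (sketch): does ~4.5 bits/weight explain the ~2.0 GB disk size?
# ASSUMPTION: Phi-4-mini-instruct has roughly 3.8e9 parameters.
params = 3.8e9
bits_per_weight = 4.5  # 4-bit values plus per-group quantization metadata

size_bytes = params * bits_per_weight / 8
size_gib = size_bytes / 2**30
print(f"~{size_gib:.2f} GiB")  # ~1.99 GiB, close to the ~2.0 GB reported above

# At ~56 tokens/sec, the 120-token example below takes roughly:
print(f"~{120 / 56:.1f} s")  # ~2.1 s
```

The small gap between 4.0 nominal bits and ~4.5 effective bits comes from the extra quantization metadata stored alongside the packed weights.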
+
+ ## Usage
+ ```bash
+ mlx_lm.generate \
+   --model Irfanuruchi/Phi-4-mini-instruct-MLX-4bit \
+   --prompt "Give me 5 short offline assistant tips." \
+   --max-tokens 120
+ ```
+
+ ## License
+ The original model's license applies. See `microsoft/Phi-4-mini-instruct`.