Irfanuruchi committed (verified)
Commit 16cfc7f · 1 Parent(s): 6157987

Update README.md

Files changed (1): README.md +34 -0
README.md CHANGED
@@ -32,9 +32,43 @@ tags:
  - nlp
  - code
  - mlx
+ - apple-silicon
+ - on-device
+ - phi
+ - local-llm
+ - quantized
  widget:
  - messages:
    - role: user
      content: Can you provide ways to eat combinations of bananas and dragonfruits?
  base_model: microsoft/Phi-4-mini-instruct
  ---
+
+ # Phi-4-mini-instruct (MLX 4-bit)
+
+ This is a **4-bit MLX-quantized** version of `microsoft/Phi-4-mini-instruct`, optimized for **Apple Silicon** and **local, on-device inference**.
+
+ ## Benchmark Environment
+ - Device: MacBook Pro (M3 Pro)
+ - Runtime: MLX
+ - Precision: 4-bit (~4.5 bits per weight)
+
+ ## Performance (Measured)
+ - Disk size: ~2.0 GB
+ - Peak memory: ~2.24 GB
+ - Generation speed: ~56 tokens/sec
+
+ > Benchmarks were collected on macOS (M3 Pro).
+ > iPhone / iPad performance will vary with the device's hardware and available memory.
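As a rough sanity check, the reported disk size is consistent with the quoted ~4.5 effective bits per weight. The sketch below assumes a parameter count of roughly 3.8 B for Phi-4-mini-instruct (taken as an assumption from the upstream model card, not measured from this repo):

```python
# Sanity check (sketch): does ~4.5 bits/weight explain the ~2.0 GB disk size?
# ASSUMPTION: Phi-4-mini-instruct has roughly 3.8e9 parameters.
params = 3.8e9
bits_per_weight = 4.5  # 4-bit values plus per-group quantization metadata

size_bytes = params * bits_per_weight / 8
size_gib = size_bytes / 2**30
print(f"~{size_gib:.2f} GiB")  # ~1.99 GiB, close to the ~2.0 GB reported above

# At ~56 tokens/sec, the 120-token example below takes roughly:
print(f"~{120 / 56:.1f} s")  # ~2.1 s
```

The small gap between 4.0 nominal bits and ~4.5 effective bits comes from the extra quantization metadata stored alongside the packed weights.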
+
+ ## Usage
+ ```bash
+ mlx_lm.generate \
+   --model Irfanuruchi/Phi-4-mini-instruct-MLX-4bit \
+   --prompt "Give me 5 short offline assistant tips." \
+   --max-tokens 120
+ ```
+
+ ## License
+ The original model's license applies. See `microsoft/Phi-4-mini-instruct`.