Update README.md
README.md
CHANGED
@@ -32,9 +32,43 @@ tags:
- nlp
- code
- mlx
+- apple-silicon
+- on-device
+- phi
+- local-llm
+- quantized
widget:
- messages:
  - role: user
    content: Can you provide ways to eat combinations of bananas and dragonfruits?
base_model: microsoft/Phi-4-mini-instruct
---
+
+# Phi-4-mini-instruct (MLX 4-bit)
+
+This is a **4-bit MLX quantized** version of `microsoft/Phi-4-mini-instruct`, optimized for **Apple Silicon** and **local / on-device inference**.
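+
+The exact conversion command isn't recorded in this card. A 4-bit MLX checkpoint like this one is typically produced with `mlx_lm.convert`; the following is a minimal sketch, and the flags and output path are assumptions rather than the author's exact invocation:
+
+```bash
+# Sketch (assumed flags): quantize the original weights to 4-bit MLX.
+# --q-bits 4 is consistent with the ~4.5 bits/weight reported below,
+# since per-group scales and biases add overhead on top of the 4-bit weights.
+mlx_lm.convert \
+  --hf-path microsoft/Phi-4-mini-instruct \
+  -q --q-bits 4 \
+  --mlx-path ./Phi-4-mini-instruct-MLX-4bit
+```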
+
+## Benchmark Environment
+- Device: MacBook Pro (M3 Pro)
+- Runtime: MLX
+- Precision: 4-bit (~4.5 bits per weight)
+
+## Performance (Measured)
+- Disk size: ~2.0 GB
+- Peak memory: ~2.24 GB
+- Generation speed: ~56 tokens/sec
+
+> Benchmarks were collected on macOS (M3 Pro).
+> iPhone / iPad performance will vary depending on hardware and memory.
+
+## Usage
+```bash
+mlx_lm.generate \
+  --model Irfanuruchi/Phi-4-mini-instruct-MLX-4bit \
+  --prompt "Give me 5 short offline assistant tips." \
+  --max-tokens 120
+```
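+
+Beyond one-shot generation, `mlx-lm` also ships an interactive REPL (`mlx_lm.chat`) and an OpenAI-compatible HTTP server (`mlx_lm.server`). A minimal sketch, assuming the package's default chat-completions endpoint:
+
+```bash
+# Interactive chat in the terminal.
+mlx_lm.chat --model Irfanuruchi/Phi-4-mini-instruct-MLX-4bit
+
+# Serve the model locally, then query it with a standard chat-completions payload.
+mlx_lm.server --model Irfanuruchi/Phi-4-mini-instruct-MLX-4bit --port 8080
+curl http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{"messages": [{"role": "user", "content": "Give me 5 short offline assistant tips."}], "max_tokens": 120}'
+```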
+
+## License
+The original model's license applies; see `microsoft/Phi-4-mini-instruct` for terms.