Phi-4-mini-instruct (MLX 4-bit)
This is a 4-bit MLX quantized version of microsoft/Phi-4-mini-instruct, optimized for Apple Silicon and local / on-device inference.
Benchmark Environment
- Device: MacBook Pro (M3 Pro)
- Runtime: MLX
- Precision: 4-bit (~4.5 bits per weight)
Performance (Measured)
- Disk size: ~2.0 GB
- Peak memory: ~2.24 GB
- Generation speed: ~56 tokens/sec
Benchmarks were collected on macOS (M3 Pro).
iPhone / iPad performance will vary depending on hardware and memory.
Usage
mlx_lm.generate \
--model Irfanuruchi/Phi-4-mini-instruct-MLX-4bit \
--prompt "Give me 5 short offline assistant tips." \
--max-tokens 120
License
Original model license applies. See microsoft/Phi-4-mini-instruct.
- Downloads last month
- 91
Model size
0.6B params
Tensor type
BF16
·
U32
·
Hardware compatibility
Log In
to add your hardware
4-bit
Model tree for Irfanuruchi/Phi-4-mini-instruct-MLX-4bit
Base model
microsoft/Phi-4-mini-instruct