MediPhi-Instruct (MLX · 4-bit)
This repository contains an MLX-format 4-bit quantized version of microsoft/MediPhi-Instruct,
converted using mlx-lm for efficient on-device inference on Apple silicon.
This model is intended for iOS / iPadOS / macOS usage where memory and power constraints require aggressive quantization while preserving clinical reasoning quality.
Model details
- Base model: microsoft/MediPhi-Instruct (built on Phi-3.5-mini-instruct)
- Parameters: ~3.8B
- Quantization: 4-bit (MLX)
- Format: MLX (not GGUF)
- Intended use: On-device medical and clinical QA, decision support, and explanations
- Language: English
⚠️ This is a conversion only. No additional fine-tuning was performed.
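For reference, a conversion of this kind is typically produced with mlx-lm's convert tool. The command below is a representative sketch; the exact flags and group size used for this repo are an assumption:

```bash
# Representative conversion command (exact settings for this repo not recorded)
python -m mlx_lm.convert \
    --hf-path microsoft/MediPhi-Instruct \
    --mlx-path MediPhi-Instruct-mlx-4bit \
    -q --q-bits 4
```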
Why MLX 4-bit?
Compared to larger 4–7B medical models, MediPhi-Instruct shows:
- Strong clinical reasoning per parameter
- Better robustness under 4-bit quantization
- Lower memory footprint suitable for mobile devices
This makes it a strong candidate for on-device medical assistants on iPhone and iPad.
Usage (MLX-LM)
Install
```bash
pip install mlx-lm
```
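Generate

A minimal generation sketch using the standard mlx-lm Python API. The prompt and `max_tokens` value are illustrative; adjust them for your use case:

```python
from mlx_lm import load, generate

# Load the 4-bit MLX weights from this repo
model, tokenizer = load("bisonnetworking/MediPhi-Instruct-mlx-4bit")

# MediPhi-Instruct is a chat model, so apply the chat template
messages = [
    {
        "role": "user",
        "content": "Briefly explain the difference between type 1 and type 2 diabetes.",
    }
]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)

# Run generation; verbose=True streams tokens as they are produced
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
print(response)
```

The same can be done from the command line with mlx-lm's generate entry point, e.g. `mlx_lm.generate --model bisonnetworking/MediPhi-Instruct-mlx-4bit --prompt "..."`.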
Model tree for bisonnetworking/MediPhi-Instruct-mlx-4bit
- Base model: microsoft/Phi-3.5-mini-instruct
- Fine-tuned from base: microsoft/MediPhi-Instruct
- This repository: 4-bit MLX quantization of microsoft/MediPhi-Instruct