MediPhi-Instruct (MLX · 4-bit)

This repository contains an MLX-format 4-bit quantized version of
microsoft/MediPhi-Instruct,
converted using mlx-lm for efficient on-device inference on Apple silicon.

This model is intended for on-device use on iOS, iPadOS, and macOS, where memory and power constraints require aggressive quantization while preserving clinical reasoning quality.
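
For reference, a 4-bit MLX export like this one is typically produced with mlx-lm's convert entry point. The exact command used for this conversion is not recorded here; a representative invocation (output path illustrative) looks like:

python -m mlx_lm.convert --hf-path microsoft/MediPhi-Instruct --mlx-path MediPhi-Instruct-mlx-4bit -q --q-bits 4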


Model details

  • Base model: MediPhi-Instruct (Phi-3 family)
  • Parameters: ~3.8B
  • Quantization: 4-bit (MLX)
  • Format: MLX (not GGUF)
  • Intended use: On-device medical and clinical QA, decision support, and explanations
  • Language: English

⚠️ This is a conversion only. No additional fine-tuning was performed.


Why MLX 4-bit?

Compared to larger medical models in the 4–7B range, MediPhi-Instruct shows:

  • Strong clinical reasoning per parameter
  • Better robustness under 4-bit quantization
  • Lower memory footprint suitable for mobile devices (see the sizing sketch below)

This makes it a strong candidate for on-device medical assistants on iPhone and iPad.
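
As a rough sizing sketch (weights only; this ignores quantization scales, activations, and the KV cache): ~3.8B parameters at 4 bits per weight is about 3.8e9 × 0.5 bytes ≈ 1.9 GB, versus roughly 7.6 GB at 16-bit precision, which matters on devices with 6–8 GB of total RAM.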


Usage (MLX-LM)

Install

pip install mlx-lm
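
Generate

A minimal Python sketch using mlx-lm's standard load/generate API. The repo ID matches this card; the prompt is illustrative only:

from mlx_lm import load, generate

# Load the 4-bit MLX weights and tokenizer from this repo
model, tokenizer = load("bisonnetworking/MediPhi-Instruct-mlx-4bit")

# Example clinical question; wrap it in the chat template if one is defined
prompt = "What are common first-line treatments for essential hypertension?"
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, verbose=True)

mlx-lm also ships a command-line entry point, so the same model can be exercised without writing any Python:

python -m mlx_lm.generate --model bisonnetworking/MediPhi-Instruct-mlx-4bit --prompt "What are common first-line treatments for essential hypertension?"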