MediPhi-Instruct (MLX · 4-bit)

This repository contains an MLX-format 4-bit quantized version of
microsoft/MediPhi-Instruct,
converted using mlx-lm for efficient on-device inference on Apple silicon.

This model is intended for on-device use on iOS, iPadOS, and macOS, where memory and power constraints require aggressive quantization while preserving clinical reasoning quality.
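
For reference, a 4-bit MLX export like this one is typically produced with mlx-lm's convert entry point. The exact command used for this conversion is not recorded here; a representative invocation (output path illustrative) looks like:

python -m mlx_lm.convert --hf-path microsoft/MediPhi-Instruct --mlx-path MediPhi-Instruct-mlx-4bit -q --q-bits 4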


Model details

  • Base model: MediPhi-Instruct (Phi-3 family)
  • Parameters: ~3.8B
  • Quantization: 4-bit (MLX)
  • Format: MLX (not GGUF)
  • Intended use: On-device medical and clinical QA, decision support, and explanations
  • Language: English

⚠️ This is a conversion only. No additional fine-tuning was performed.


Why MLX 4-bit?

Compared to larger medical models in the 4–7B range, MediPhi-Instruct shows:

  • Strong clinical reasoning per parameter
  • Better robustness under 4-bit quantization
  • Lower memory footprint suitable for mobile devices (see the sizing sketch below)

This makes it a strong candidate for on-device medical assistants on iPhone and iPad.
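
As a rough sizing sketch (weights only; this ignores quantization scales, activations, and the KV cache): ~3.8B parameters at 4 bits per weight is about 3.8e9 × 0.5 bytes ≈ 1.9 GB, versus roughly 7.6 GB at 16-bit precision, which matters on devices with 6–8 GB of total RAM.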


Usage (MLX-LM)

Install

pip install mlx-lm
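
Generate

A minimal Python sketch using mlx-lm's standard load/generate API. The repo ID matches this card; the prompt is illustrative only:

from mlx_lm import load, generate

# Load the 4-bit MLX weights and tokenizer from this repo
model, tokenizer = load("bisonnetworking/MediPhi-Instruct-mlx-4bit")

# Example clinical question; wrap it in the chat template if one is defined
prompt = "What are common first-line treatments for essential hypertension?"
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, verbose=True)

mlx-lm also ships a command-line entry point, so the same model can be exercised without writing any Python:

python -m mlx_lm.generate --model bisonnetworking/MediPhi-Instruct-mlx-4bit --prompt "What are common first-line treatments for essential hypertension?"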