MLX artifacts for the IEEE Cloud Summit 2026 paper: 3 LLMs × 5 precisions (BF16, Q8, Q6, Q4, Q3; quantization group size 64).
- plawanrath/phi-3.5-mini-instruct-q3-mlx-cba
  Text Generation • 0.5B • 97
- plawanrath/phi-3.5-mini-instruct-q4-mlx-cba
  Text Generation • 0.6B • 11
- plawanrath/phi-3.5-mini-instruct-q6-mlx-cba
  Text Generation • 0.8B • 13
- plawanrath/mistral-7b-instruct-v0.3-q3-mlx-cba
  Text Generation • 0.9B • 10
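
The precision labels and the group size of 64 together determine roughly how many bits each quantized weight costs in storage. The sketch below works that out under one assumption that the listing itself does not state: that these artifacts use MLX-style affine group quantization, where every group of 64 weights stores one 16-bit scale and one 16-bit bias alongside the packed values. The function name `effective_bits_per_weight` and its parameters are hypothetical, chosen only for illustration.

```python
def effective_bits_per_weight(bits: int, group_size: int = 64,
                              scale_bits: int = 16, bias_bits: int = 16) -> float:
    """Storage cost per weight for affine group quantization.

    Assumption (not confirmed by the listing): each group of `group_size`
    weights carries one scale and one bias, each stored in 16 bits.
    """
    overhead = (scale_bits + bias_bits) / group_size  # per-weight share of scale + bias
    return bits + overhead


# Quantized precisions named in the collection description (BF16 is unquantized).
for label, bits in [("Q8", 8), ("Q6", 6), ("Q4", 4), ("Q3", 3)]:
    print(f"{label}: {effective_bits_per_weight(bits)} bits per weight")
```

Under that assumption, the per-group overhead is a constant 0.5 bits per weight, so it matters proportionally more at Q3 than at Q8. Layers left unquantized, if any, are not covered by this estimate.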