Conversion request to Q5_K_M for MLX

by websprockets - opened Apr 4, 2025

MLX Community org Apr 4, 2025

Could someone convert CodeLlama-70B-Instruct to Q5_K_M for MLX? It’s not listed yet and would be great for my use case (science research). Thank you!!

austinbv

MLX Community org Apr 5, 2025

Sure

austinbv

MLX Community org Apr 5, 2025

Wait, I posted too fast. Q5_K_M is a GGUF quantization format, not an MLX one. Do you just want a Q5?

websprockets

MLX Community org Apr 6, 2025

Thanks for the reply! Glad you held off. I'm waiting to see the models that are coming in for Llama 4, and whether I can get a Maverick model that'll fit (and make the most of) my brand new maxed-out M3 Ultra Mac Studio.

websprockets

MLX Community org Apr 6, 2025

Actually, let me be specific... a Llama-4-Maverick-17B-16E-Instruct-6bit 256K with vision enabled ("vision_config") is probably my sweet spot.

websprockets

MLX Community org Apr 6, 2025

I see you've made a 4bit, 6bit and 8bit Scout. I believe a 2bit Scout would fit on an M4 Pro Mac mini with 64GB, and still outperform CodeLlama Instruct 70B at 6bit. Any chance there's a 2bit Scout in the works??
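
For anyone wanting to sanity-check the "would fit in 64GB" estimate, here is a rough back-of-the-envelope sketch. It is not an official sizing tool. The assumptions are: Scout has about 109B total parameters, every weight is quantized group-wise with a group size of 64, and each group stores a 16-bit scale plus a 16-bit bias (about 0.5 extra bits per weight). The helper name `quantized_size_gb` is made up for illustration.

```python
def quantized_size_gb(n_params, bits, group_size=64, overhead_bits_per_group=32):
    """Approximate weight size in GB (10**9 bytes) for group-wise quantization.

    overhead_bits_per_group is an assumption: one 16-bit scale and one
    16-bit bias stored per group of `group_size` weights.
    """
    bits_per_weight = bits + overhead_bits_per_group / group_size
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: total (not active) parameter count of Llama 4 Scout.
SCOUT_PARAMS = 109e9

for bits in (2, 4, 6, 8):
    size = quantized_size_gb(SCOUT_PARAMS, bits)
    print(f"{bits}-bit Scout: ~{size:.0f} GB of weights")
```

This counts weights only. The KV cache, activations, any layers left unquantized, and the share of unified memory that macOS keeps for the system all come on top, so treat the result as a lower bound rather than a guarantee that a model will load.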

austinbv

MLX Community org Apr 7, 2025

I will push some models with mixed quant too, so the MoE models fit better on Macs.
