May I ask if this model is compatible and runnable via OMLX?

#2
by ssbg2

Hi there, thanks a lot for your great work. I'm looking for a quantized GLM-5.1 model that can run on an M3 Ultra with 512GB. So far the OMLX framework has shown pretty stable inference performance for me. However, another 2.5-bit quantized model (inferencerlabs/GLM-5.1-MLX-2.5bit-INF) I tested earlier threw errors when running. Have you ever deployed models with OMLX? Thanks again for your dedication and contribution.

Sorry, I haven't tried OMLX. I've only used stock mlx, mlx-lm, and mlx-vlm, which are the core libraries most MLX servers are built on. As far as I can tell, OMLX uses these too, so you're most likely fine?
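
If you want to sanity-check whether the quant itself loads before involving OMLX, a minimal sketch with stock mlx-lm might look like the following. The repo ID is a placeholder, not this model's actual path:

```python
# Smoke test with stock mlx-lm: if this loads and generates, the
# quantized weights themselves are fine, and any OMLX failure is
# likely in the serving layer rather than the model files.
# NOTE: the repo ID below is a placeholder -- substitute the actual
# Hugging Face path of the quant you downloaded.
from mlx_lm import load, generate

model, tokenizer = load("your-org/your-glm-quant")

prompt = "Hello, how are you?"
text = generate(model, tokenizer, prompt=prompt, max_tokens=64, verbose=True)
print(text)
```

If that runs cleanly but OMLX still errors out, the problem is probably OMLX-specific and worth reporting there.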
