May I ask if this model is compatible and runnable via OMLX?

#2
by ssbg2

Hi there, thanks a lot for your great work. I'm looking for a quantized GLM-5.1 model that can run on an M3 Ultra with 512GB. So far the OMLX framework has shown pretty stable inference performance for me. However, another 2.5-bit quantized model (inferencerlabs/GLM-5.1-MLX-2.5bit-INF) I tested earlier threw errors when running. Have you ever deployed models with OMLX? Thanks again for your dedication and contribution.

Sorry, I haven't tried OMLX. I've only used stock mlx, mlx-lm, and mlx-vlm, which are the core libraries most MLX servers are built on. As far as I can tell, OMLX uses these too, so you're most likely fine?
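
If you want to sanity-check whether the quant itself loads before involving OMLX, a minimal sketch with stock mlx-lm might look like the following. The repo ID is a placeholder, not this model's actual path:

```python
# Smoke test with stock mlx-lm: if this loads and generates, the
# quantized weights themselves are fine, and any OMLX failure is
# likely in the serving layer rather than the model files.
# NOTE: the repo ID below is a placeholder -- substitute the actual
# Hugging Face path of the quant you downloaded.
from mlx_lm import load, generate

model, tokenizer = load("your-org/your-glm-quant")

prompt = "Hello, how are you?"
text = generate(model, tokenizer, prompt=prompt, max_tokens=64, verbose=True)
print(text)
```

If that runs cleanly but OMLX still errors out, the problem is probably OMLX-specific and worth reporting there.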
