Question about model completeness and M4 Max test
Hi, I noticed the model page shows "CURRENTLY UPLOADING..." and only 11 out of
57 model shards are available (~103GB).
Questions:
- The hardware compatibility section says it was tested on an M4 Max 128GB with Inferencer v1.10.1, but the full model would be ~500GB+. How is this possible? (See the quick extrapolation below.)
- Is there a smaller quantization (like q4.5 or q5.5) that is already complete and available for download?
- When do you expect the full upload to complete?
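A quick back-of-the-envelope check on that first question, assuming the remaining shards are roughly the same size as the 11 uploaded so far:

```python
# Rough size estimate for the full 5.6bit upload, extrapolated from the
# 11 shards (~103GB) visible so far. Assumes roughly equal shard sizes.
shards_total = 57
shards_uploaded = 11
uploaded_gb = 103

avg_shard_gb = uploaded_gb / shards_uploaded      # ~9.4 GB per shard
estimated_full_gb = avg_shard_gb * shards_total   # ~534 GB

print(f"Estimated full size: ~{estimated_full_gb:.0f} GB")
```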
Thanks!
The 5.6bit version was tested across an M3 Ultra (512GB RAM) and an M4 Max (128GB RAM) using Inferencer's distributed compute feature, which pools both machines' RAM together.
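A minimal sketch of the memory arithmetic behind that setup, using the ~534GB estimate above; it ignores KV cache and activation overhead, so real headroom is smaller:

```python
# Does pooled RAM across both machines cover the estimated 5.6bit weights?
# (Ignores KV cache / activation overhead, so this is an upper bound.)
model_gb = 534                 # estimated full 5.6bit size (see above)
machines_gb = [512, 128]       # M3 Ultra + M4 Max

pooled_gb = sum(machines_gb)   # 640 GB pooled
print(f"Pooled: {pooled_gb} GB, headroom after weights: ~{pooled_gb - model_gb} GB")
```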
Yes, the 4.8bit version is uploaded here: https://huggingface.co/inferencerlabs/GLM-5-MLX-4.8bit
Let me know if you would still like it uploaded, given (1).
When I load the model using LM Studio, I get this error: Error when loading model: ValueError: Missing 2843 parameters:
lm_head.biases,
lm_head.scales,
lm_head.weight,
model.layers.17.input_layernorm.weight,
model.layers.17.mlp.gate.e_score_correction_bias,
model.layers.17.mlp.gate.weight,
model.layers.17.mlp.shared_experts.down_proj.biases,
model.layers.17.mlp.shared_experts.down_proj.scales,
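A "Missing N parameters" error usually means shard files are absent, which matches the incomplete upload above. A minimal diagnostic sketch, assuming the standard `model.safetensors.index.json` weight map (the local path below is a placeholder):

```python
# Check whether every shard file referenced by the safetensors index is
# actually on disk - missing shards produce "Missing N parameters" errors.
import json
from pathlib import Path

model_dir = Path("~/models/GLM-5-MLX-5.6bit").expanduser()  # placeholder path
index = json.loads((model_dir / "model.safetensors.index.json").read_text())

expected = set(index["weight_map"].values())   # shard files the index expects
present = {p.name for p in model_dir.glob("*.safetensors")}

print(f"{len(present & expected)}/{len(expected)} expected shards present")
for name in sorted(expected - present):
    print("missing:", name)
```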
I am using the M3 Ultra 256GB version.
The 5.6bit version requires over 512GB RAM.
Got it, but I saw the size is only ~100GB.
Hi, yes, I'm interested in this quantized version for my 512GB Mac Studio, please.
Could you please upload your 5.6bit version? (I'm using an M3 Ultra 512GB and an M4 Max 128GB ... both Studios.)
Thank you!
Love your tweaks! All of them are great quants! ❤️
Oh, one other question... why don't you enable vision on your models? (Not that I really use it... but I was thinking of using it with moltis, and vision might be good too! ... and using it with mlx-vlm.)
Yes, can do, and yes, vision will be enabled going forward - it was mainly a storage optimisation; however, the latest conversion, Mistral Small 4, has the vision weights.
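For anyone who wants to try those vision weights, a minimal mlx-vlm sketch - the repo id below is a placeholder, and generate()'s argument names/order have varied across mlx-vlm releases, so check against your installed version:

```python
# Minimal mlx-vlm usage sketch. Placeholder repo id; keyword arguments
# are used because generate()'s signature has shifted between releases.
from mlx_vlm import load, generate

model, processor = load("inferencerlabs/Mistral-Small-4-MLX")  # placeholder repo id

output = generate(
    model,
    processor,
    prompt="Describe this image.",
    image="photo.jpg",   # local image path
    max_tokens=256,
)
print(output)
```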