EuroLLM 22B conversion on a Mac Studio M2 Max 32 GB
Hi,
I have an issue with a local conversion of the EuroLLM 22B model. I can use the Q6-version from the community with no issues. Now I would like to test a mixed 4-6 quantization with a group size of 128. When I try to convert the model using:
mlx_lm.convert \
--hf-path mlx-community/EuroLLM-22B-Instruct-2512-mlx-bf16 \
--mlx-path models/EuroLLM-22b-mixed-4-6 \
-q \
--quant-predicate mixed_4_6
and these models:
- mlx-community/EuroLLM-22B-Instruct-2512-mlx-bf16 or
- utter-project/EuroLLM-22B-Instruct-2512
I get this error:
libc++abi: terminating due to uncaught exception of type std::runtime_error: [METAL] Command buffer execution failed: Caused GPU Timeout Error (00000002:kIOGPUCommandBufferCallbackErrorTimeout)
I increased the timeout with:
sudo sysctl debug.lowpri_throttle_enabled=0
export METAL_DEVICE_TIMEOUT=7200
Is my Mac Studio just not suitable for the job? Or is there a setting I need to change? I have successfully converted Codestral 22B on this Mac before.
I am currently using MLX-LM 0.30.1.
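For context, mlx_lm's quant predicates map each layer to its quantization settings. A hypothetical sketch of what a mixed 4/6 predicate could look like (the function name, signature, and return shape are illustrative assumptions, not mlx_lm's exact implementation):

```python
# Hypothetical sketch of a mixed 4/6 quant predicate for mlx_lm.convert.
# Assumption (not verified against mlx_lm's source): the predicate receives
# the layer path plus the module and model config, and returns a dict of
# per-layer quantization kwargs.

def mixed_4_6(path, module, config):
    """Keep sensitive layers (embeddings, lm_head) at 6 bits,
    everything else at 4 bits, with group size 128."""
    if "embed" in path or "lm_head" in path:
        return {"bits": 6, "group_size": 128}
    return {"bits": 4, "group_size": 128}
```

The idea is that attention and MLP projections tolerate 4-bit weights well, while the embedding and output layers are kept at 6 bits to limit quality loss.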
OK, found the reason for this... bit stupid. I had the Hugging Face cache on an external disk. I thought that Thunderbolt with an SSD would be no problem. It isn't for most models, but it ended up being a big issue with EuroLLM 22B. So the GPU timeout was caused by too much latency from my Mac M2's Thunderbolt connection.
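For anyone hitting the same thing: the cache location can be moved back to the internal disk by setting HF_HOME (the standard Hugging Face override; the path below is its default location, adjust as needed):

```shell
# Point the Hugging Face cache at the internal SSD instead of the
# external Thunderbolt drive (shown: the default cache location)
export HF_HOME="$HOME/.cache/huggingface"
mkdir -p "$HF_HOME"
```

Note that already-downloaded models on the external disk won't be found after this change; either move the cache directory over or let mlx_lm re-download the weights.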