EuroLLM 22B conversion on a Mac Studio M2 Max 32 GB

#28
by HugMischa - opened
MLX Community org
•
edited 11 days ago

Hi,
I have an issue converting the EuroLLM 22B model locally. I can use the Q6 version from the community with no issues. Now I would like to test a mixed 4/6 quantization with a group size of 128. When I try to convert the model using:

mlx_lm.convert \
      --hf-path mlx-community/EuroLLM-22B-Instruct-2512-mlx-bf16 \
      --mlx-path models/EuroLLM-22b-mixed-4-6 \
      -q \
      --quant-predicate mixed_4_6

and these models:

  • mlx-community/EuroLLM-22B-Instruct-2512-mlx-bf16 or
  • utter-project/EuroLLM-22B-Instruct-2512

I get this error:

libc++abi: terminating due to uncaught exception of type std::runtime_error: [METAL] Command buffer execution failed: Caused GPU Timeout Error (00000002:kIOGPUCommandBufferCallbackErrorTimeout)

I increased the timeout with:

sudo sysctl debug.lowpri_throttle_enabled=0
export METAL_DEVICE_TIMEOUT=7200

Is my Mac Studio just not suited to the job, or is there a setting I need to change? I have successfully converted Codestral 22B on this Mac before.
Currently I am using MLX-LM 0.30.1.
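For anyone wondering what a mixed 4/6 predicate amounts to: it maps each weight's name to quantization parameters, keeping sensitive layers at 6 bits and the rest at 4. The sketch below is purely illustrative — the layer-selection rule and the `mixed_4_6` signature here are my own example, not mlx_lm's actual implementation:

```python
# Illustrative mixed 4/6-bit quant predicate (hypothetical rules, NOT
# mlx_lm's actual mixed_4_6): keep embeddings, the output head, and
# attention projections at 6 bits; everything else at 4 bits.
def mixed_4_6(path: str, group_size: int = 128) -> dict:
    high_precision = ("embed", "lm_head", "attn")
    bits = 6 if any(key in path for key in high_precision) else 4
    return {"bits": bits, "group_size": group_size}

# An MLP projection gets 4 bits, an attention projection gets 6:
print(mixed_4_6("model.layers.0.mlp.up_proj"))
print(mixed_4_6("model.layers.0.self_attn.q_proj"))
```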

MLX Community org

OK, found the reason for this... a bit stupid. I had the Hugging Face cache on an external disk. I thought Thunderbolt with an SSD would be no problem, and it isn't for most models, but it turned out to be a big issue with EuroLLM 22B. The GPU timeout was caused by too much latency over my Mac M2's Thunderbolt connection.
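If anyone hits the same thing, moving the cache back to the internal SSD is just a matter of setting `HF_HOME` before running the conversion. The path below is only an example:

```shell
# Point the Hugging Face cache at the internal SSD before converting.
# HF_HOME controls where huggingface_hub stores downloads; ~/hf-cache
# is an example path, use whatever suits your machine.
export HF_HOME="$HOME/hf-cache"
mkdir -p "$HF_HOME/hub"
echo "cache now at: $HF_HOME"
```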
