Q6_K_XL hangs no matter what settings I try
#4
by puchuu - opened
I am using the latest llama.cpp. Devstral-Small-2-24B-Instruct-2512-GGUF:Q8_K_XL works perfectly with the recommended settings (temp 0.15 and min-p 0.01). Devstral-2-123B-Instruct-2512-GGUF:Q6_K_XL hangs, spamming random words in a loop, with the recommended settings and with every other setting I have tried. So I think this GGUF is wrong.
I have also tested Q4_K_XL and it does not work either, so either the GGUF is broken or llama.cpp is. Meanwhile, all other models from Unsloth work perfectly: Devstral 2 Small, Qwen3 Coder, Qwen3 Next, etc.
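For reference, a sketch of the kind of llama.cpp invocation involved (the GGUF filename below is a placeholder, not the exact file; `--temp` and `--min-p` are the recommended sampling settings mentioned above):

```shell
# Sketch only: the model filename is a placeholder for the local Q6_K_XL file.
# --temp 0.15 and --min-p 0.01 are the recommended sampling settings;
# -n caps the number of generated tokens so a looping model still terminates.
./llama-cli \
  -m Devstral-2-123B-Instruct-2512-Q6_K_XL.gguf \
  --temp 0.15 --min-p 0.01 \
  -p "Write a hello-world function in Python." \
  -n 256
```

With the 24B Q8_K_XL model this kind of run completes normally; with the 123B Q6_K_XL and Q4_K_XL quants it degenerates into repeated random words.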
puchuu changed discussion status to closed