Q6_K_XL hangs no matter what settings I try
#4
by puchuu - opened
I am using the latest llama.cpp. Devstral-Small-2-24B-Instruct-2512-GGUF:Q8_K_XL works perfectly with the recommended settings (temp 0.15 and min-p 0.01). Devstral-2-123B-Instruct-2512-GGUF:Q6_K_XL hangs, spamming random words in a loop, with the recommended settings and with every other setting I have tried. So I think this GGUF is wrong.
I have also tested Q4_K_XL and it does not work either, so either the GGUF is broken or llama.cpp is. Meanwhile, all other models from Unsloth work perfectly: Devstral 2 Small, Qwen3 Coder, Qwen3 Next, etc.
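For reference, a sketch of the kind of llama.cpp invocation involved (the GGUF filename below is a placeholder, not the exact file; `--temp` and `--min-p` are the recommended sampling settings mentioned above):

```shell
# Sketch only: the model filename is a placeholder for the local Q6_K_XL file.
# --temp 0.15 and --min-p 0.01 are the recommended sampling settings;
# -n caps the number of generated tokens so a looping model still terminates.
./llama-cli \
  -m Devstral-2-123B-Instruct-2512-Q6_K_XL.gguf \
  --temp 0.15 --min-p 0.01 \
  -p "Write a hello-world function in Python." \
  -n 256
```

With the 24B Q8_K_XL model this kind of run completes normally; with the 123B Q6_K_XL and Q4_K_XL quants it degenerates into repeated random words.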
puchuu changed discussion status to closed