How to fix Qwen 3.5 Quant/tool issues + NEW Sheep And a few lost sheep...
Hey,
Re: tool usage issues:
Use Q6 or Q8 quants. The issue lies in the new tensors (ssm_alpha and ssm_beta), which should be kept at Q8/BF16 (the same for all quants) rather than quantized down; instead they are currently quantized at the "quant level" of the file, so to speak.
This seems to be drastically affecting tool usage.
To manually create a quant via llama.cpp with these tensors corrected, use the following (main.gguf is the file created at the convert-to-gguf ... step):
./llama-quantize --tensor-type ssm_alpha=bf16 --tensor-type ssm_beta=bf16 x:/main.gguf D:/lms/test/llms/4B-MODEL-NAME-Q4_K_S.gguf Q4_K_S 8
(On Windows, llama-quantize is llama-quantize.exe.)
(This will work for all Qwen 3.5 models, regardless of parameter size.)
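If you need several quants of the same model, the command above can be generated per quant type instead of being typed by hand. Below is a minimal sketch, not an official llama.cpp tool: the helper name build_quantize_cmd, the list of quant types in the loop, and the output naming scheme are assumptions for illustration. Only the two --tensor-type overrides, the paths, and the trailing thread count of 8 come from the command above. The script only prints the commands (a dry run); nothing is executed.

```python
import shlex

# Overrides taken from the command above: keep both SSM tensors at BF16.
OVERRIDES = {"ssm_alpha": "bf16", "ssm_beta": "bf16"}


def build_quantize_cmd(src, out_dir, model_name, quant,
                       threads=8, binary="./llama-quantize"):
    """Return the llama-quantize argument list for one quant type.

    The helper name and the output naming scheme (<model_name>-<quant>.gguf)
    are hypothetical; adjust them to your own layout.
    """
    cmd = [binary]
    for tensor, dtype in OVERRIDES.items():
        cmd += ["--tensor-type", f"{tensor}={dtype}"]
    out = f"{out_dir}/{model_name}-{quant}.gguf"
    # Positional arguments: input file, output file, quant type, thread count.
    cmd += [src, out, quant, str(threads)]
    return cmd


if __name__ == "__main__":
    # Illustrative list of quant types; pick the ones you actually need.
    for quant in ("Q4_K_S", "Q4_K_M", "Q5_K_M"):
        cmd = build_quantize_cmd("x:/main.gguf", "D:/lms/test/llms",
                                 "4B-MODEL-NAME", quant)
        print(shlex.join(cmd))
```

To actually run a command rather than print it, pass the returned list to subprocess.run(cmd, check=True); on Windows, set binary to the path of llama-quantize.exe.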
NEW SHEEP:
https://huggingface.co/DavidAU/Qwen3.5-9B-Star-Trek-TNG-DS9-Heretic-Uncensored-Thinking
LOST SHEEP:
https://huggingface.co/DavidAU/Qwen3.5-9B-Polaris-HighIQ-THINKING
https://huggingface.co/DavidAU/Qwen3.5-9B-Deckard-Claude-DIMOE-Uncensored-Heretic-Thinking
https://huggingface.co/DavidAU/Qwen3.5-9B-Claude-4.6-Opus-Deckard-V4.2-Uncensored-Heretic-Thinking
https://huggingface.co/DavidAU/Qwen3.5-2B-Claude-4.6-OS-Auto-Variable-HERETIC-UNCENSORED-THINKING
Uhm, I'm not sure what I need to do, so I will just quant it with the normal queue. If something needs to be adjusted during the quantization process (but it still runs on the main llama.cpp branch), I need to ask nico to help... please let me know.
I started queueing and got some random error. It doesn't seem to have queued, yet it says the models are already queued; they don't show up in the queue even though it claims they're there...
It's (hopefully) queued!
You can check for progress at http://hf.tst.eu/status.html or regularly check the model summary pages below for quants to appear:
https://hf.tst.eu/model#Qwen3.5-9B-Star-Trek-TNG-DS9-Heretic-Uncensored-Thinking-GGUF
https://hf.tst.eu/model#Qwen3.5-9B-Polaris-HighIQ-THINKING-GGUF
https://hf.tst.eu/model#Qwen3.5-9B-Deckard-Claude-DIMOE-Uncensored-Heretic-Thinking-GGUF
https://hf.tst.eu/model#Qwen3.5-9B-Claude-4.6-Opus-Deckard-V4.2-Uncensored-Heretic-Thinking-GGUF
https://hf.tst.eu/model#Qwen3.5-2B-Claude-4.6-OS-Auto-Variable-HERETIC-UNCENSORED-THINKING-GGUF
Thanks!
Re: quanting: just a heads-up about the 2 tensors.
I am going to open a ticket at llama.cpp to hopefully get this standardized as part of normal quanting for Qwen 3.5 models.