Q4_0 Like Qwen3.5-35B-A3B
#3, opened by engrtipusultan
Would it be possible to add a Q4_0 quant, like the one you made for Qwen3.5-35B-A3B? That one is fast on Vulkan and mainline llama.cpp, and I think it would be useful for many people.
https://huggingface.co/ubergarm/Qwen3.5-35B-A3B-GGUF#q4_0-19776-gib-4901-bpw
Also, is there any reason you did not bump token_embd.weight to Q8?