Lower spectral quants possible? Q2/Q3?

#1
by sbeltz - opened

Could you quantize Qwen3.6 27B or Qwen3.6 35B A3B to Q2_K or Q3_K sizes (<13 GB), where they would fit comfortably in 16 GB of VRAM? That would be a game changer for local inference, and a popular demonstration that your methods hold up in agentic coding use cases!
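
For a rough sense of which combinations might land under that size, here is a back-of-the-envelope sketch. It assumes the nominal bits per weight implied by the llama.cpp k-quant super-block layouts (84 bytes per 256 weights for Q2_K, 110 bytes for Q3_K) and takes the parameter counts straight from the model names. The helper `approx_size_gb` is hypothetical, and real GGUF files mix quant types across tensors and need extra room for the KV cache, so these numbers are closer to a lower bound than a prediction. They also say nothing about what the spectral method itself would produce.

```python
# Nominal bits per weight, derived from llama.cpp k-quant super-blocks
# (256 weights per block). Assumption: every tensor uses the same type,
# which real GGUF files do not do.
BITS_PER_WEIGHT = {
    "Q2_K": 84 * 8 / 256,   # 2.625 bpw
    "Q3_K": 110 * 8 / 256,  # 3.4375 bpw
}


def approx_size_gb(params_billion: float, quant: str) -> float:
    """Estimate weight size in GB (10^9 bytes) for a uniform quant type."""
    total_bits = params_billion * 1e9 * BITS_PER_WEIGHT[quant]
    return total_bits / 8 / 1e9


if __name__ == "__main__":
    # Parameter counts are read off the model names (assumption).
    for name, params in [("Qwen3.6 27B", 27.0), ("Qwen3.6 35B A3B", 35.0)]:
        for quant in BITS_PER_WEIGHT:
            size = approx_size_gb(params, quant)
            verdict = "under" if size < 13 else "over"
            print(f"{name} {quant}: ~{size:.1f} GB ({verdict} 13 GB)")
```

By this estimate the 27B model would be the safer target at either size, while the 35B A3B would likely need Q2_K to stay under 13 GB.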
