Request for UD versions of larger quants like Q6_K

#1
by mingyi456 - opened

Thanks for extending your UD quants to diffusion models.

But on my 4090, Q8_0 for Qwen-Image appears to take up a bit too much VRAM, which makes my system unresponsive while I am using it. Q6_K, on the other hand, feels like a bit of a waste given the VRAM it leaves unused. Could you guys also upload a UD Q6_K that sits between Q6 and Q8 in size?
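For context, the size gap between the two quants can be estimated from their bits-per-weight (bpw): in llama.cpp's schemes, Q8_0 is 8.5 bpw (32-weight blocks plus one fp16 scale each) and Q6_K is 6.5625 bpw. A minimal sketch, where the parameter count `N` is a hypothetical value for illustration only:

```python
# Rough GGUF footprint estimate from bits-per-weight (bpw).
# bpw values from llama.cpp's quantization formats:
#   Q8_0 = 8.5 bpw, Q6_K = 6.5625 bpw
def est_gib(n_params: float, bpw: float) -> float:
    """Approximate file/VRAM footprint in GiB."""
    return n_params * bpw / 8 / 2**30

N = 20e9  # hypothetical parameter count, for illustration
for name, bpw in [("Q6_K", 6.5625), ("Q8_0", 8.5)]:
    print(f"{name}: ~{est_gib(N, bpw):.1f} GiB")
```

A quant between the two (as UD mixes do, by keeping only the most sensitive tensors at higher precision) would land somewhere in that range.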

Unsloth AI org

Thank you for the support! We'll see what we can do; we're still iterating on our methodology and will need to do more testing so we can incorporate your feedback!

I'm also wondering why the diffusion GGUFs aren't named UD_Qn_K_XL?
