SFW Quantized Models

#1
by dani2191 - opened

First of all, thank you for your work. I've been using the Q3 models with my 8 GB 3060 Ti these past few days with quite good results.
Due to my hardware limitations and use case, I wanted to make my own quantized model of the SFW version of Remix, and I stumbled upon the ComfyUI-GGUF tools + llama-quantize convert scripts. So far I've only used them to quantize my own SDXL models.
I was wondering if those are enough for quantizing Wan 2.2 models (the docs only mention Wan 2.1), or if you do it another way.
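For reference, the two-step flow I used for my SDXL quants looks roughly like this; I'm assuming the same steps carry over to Wan 2.2. File names and the quant type below are just examples, not actual release names:

```python
# Rough sketch of the ComfyUI-GGUF two-step flow (paths/names are examples).
# Step 1: tools/convert.py turns the .safetensors model into a full-precision GGUF
#         (the README example produces a "<name>-F16.gguf"; check --help to confirm).
# Step 2: the patched llama-quantize binary built per the ComfyUI-GGUF instructions
#         quantizes that GGUF to the target type.
import subprocess

SRC = "wan2.2_sfw.safetensors"   # example source checkpoint
F16 = "wan2.2_sfw-F16.gguf"      # intermediate full-precision GGUF

# Step 1: convert safetensors -> GGUF
subprocess.run(["python", "tools/convert.py", "--src", SRC], check=True)

# Step 2: quantize (usage: llama-quantize <input> <output> <quant_type>)
subprocess.run(["./llama-quantize", F16, "wan2.2_sfw-Q3_K_M.gguf", "Q3_K_M"], check=True)
```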

I also used the tools from the GGUF custom node to create these; the only extra step was that I had patientx, the ZLUDA repo owner, provide scripts that work with AMD cards through ZLUDA.
I've also created a script that runs once and produces multiple quantized versions in a single pass (see the sketch below).
Just a heads-up: you'll need a total of about 85 GB of memory (I have 32 GB of RAM + a 50 GB pagefile) to convert Wan 2.2 models to GGUF.
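The batch part is basically just a loop over llama-quantize. A trimmed-down sketch of the idea (the quant list and file names are examples, not my exact script):

```python
# Trimmed-down sketch of a batch quantization script: one F16 GGUF in, several quants out.
# Assumes the patched llama-quantize binary from the ComfyUI-GGUF instructions is built;
# quant types and paths here are just examples.
import subprocess
from pathlib import Path

F16_GGUF = Path("wan2.2_sfw-F16.gguf")           # example intermediate file
QUANTS = ["Q3_K_M", "Q4_K_M", "Q5_K_M", "Q8_0"]  # example target quant types

for quant in QUANTS:
    # Derive the output name from the F16 file, e.g. wan2.2_sfw-Q4_K_M.gguf
    out = F16_GGUF.with_name(f"{F16_GGUF.stem.removesuffix('-F16')}-{quant}.gguf")
    print(f"Quantizing -> {out}")
    subprocess.run(["./llama-quantize", str(F16_GGUF), str(out), quant], check=True)
```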

Great, I think I'll manage. I can always make a Jupyter notebook for Colab if I need to.
