Quick-Quantize / plan.md

Create a Space for quantizing models

Download and Upload

Needs code for downloading a model file from a repo on Hugging Face using huggingface_hub

Needs code for uploading quantized model to a target repo as a pull request using huggingface_hub
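
A minimal sketch of the two steps above, assuming huggingface_hub is installed and a write token is available to it (for example through the HF_TOKEN environment variable). The function names and the `commit_message` helper are hypothetical and not part of the plan; `hf_hub_download` and `HfApi.upload_file(..., create_pr=True)` are the library calls doing the work.

```python
from pathlib import Path


def download_model(source_repo: str, filename: str) -> Path:
    """Fetch one file from a Hub repo into the local cache and return its path."""
    # Imported lazily so this module can be loaded without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return Path(hf_hub_download(repo_id=source_repo, filename=filename))


def commit_message(source_repo: str, source_file: str, target_file: str) -> str:
    """Hypothetical helper: build the pull request title for an upload."""
    return f"Add {target_file} (quantized from {source_repo}/{source_file})"


def upload_as_pull_request(
    local_path: Path, target_repo: str, target_file: str, message: str
) -> str:
    """Upload the quantized file to the target repo as a pull request.

    Returns the URL of the pull request that was opened.
    """
    from huggingface_hub import HfApi

    info = HfApi().upload_file(
        path_or_fileobj=str(local_path),
        path_in_repo=target_file,
        repo_id=target_repo,
        commit_message=message,
        create_pr=True,  # open a pull request instead of committing to the main branch
    )
    return info.pr_url
```

Opening a pull request rather than committing directly means the Space does not need write access to the target repo's main branch, and the repo owner can review the quantized file before merging.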

Quantization options in UI

Source repo and filename for input model

Target repo and filename for output model

Quant format options in UI

int8 rowwise (add -int8mixedrow-simple to output model name): int8=True scaling_mode="row"

mxfp8 (add -mxfp8mixed-simple to output model name): mxfp8=True

fp8 (default; add -fp8mixed-simple to output model name): scaling_mode="tensor"
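
One way the format choices above could be held in the Space is a lookup table from the UI label to the name suffix and the quantizer options. This is only a sketch: `QUANT_FORMATS` and `output_filename` are hypothetical names, and placing the suffix before the file extension is an assumption, since the plan only says the suffix is added to the output model name.

```python
from pathlib import PurePosixPath

# UI label -> (suffix added to the output name, options passed to the quantizer).
# The option names and values are copied from the plan.
QUANT_FORMATS = {
    "fp8": ("-fp8mixed-simple", {"scaling_mode": "tensor"}),
    "int8 rowwise": ("-int8mixedrow-simple", {"int8": True, "scaling_mode": "row"}),
    "mxfp8": ("-mxfp8mixed-simple", {"mxfp8": True}),
}
DEFAULT_FORMAT = "fp8"


def output_filename(source_filename: str, quant_format: str = DEFAULT_FORMAT) -> str:
    """Insert the format suffix before the file extension (assumed placement)."""
    suffix, _options = QUANT_FORMATS[quant_format]
    path = PurePosixPath(source_filename)
    return str(path.with_name(path.stem + suffix + path.suffix))


print(output_filename("model.safetensors"))
print(output_filename("model.safetensors", "int8 rowwise"))
```

The result can serve as the default value of the target filename field, which the user can still overwrite.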

Layer filter options in UI

Anima: anima=True

Microsoft Lens: lens=True

Flux2: flux2=True

Chroma: distillation_large=True

Radiance: nerf_large=True radiance=True

WAN: wan=True

LTX-2.x: ltxv2=True

Qwen Image (should add high precision matmul option): qwen=True full_precision_matrix_mult=True

Z-Image: zimage=True zimage_refiner=True

Regular expression (string value should be a free-text input): exclude-layers="(substring_1|substring_2|substring_3)"
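
The layer filter presets above could be kept in a similar table, with the free-text regular expression checked before it is passed on. This is a sketch under assumptions: `LAYER_FILTERS` and `layer_filter_options` are hypothetical names, a single preset is selected at a time, and selecting Qwen Image is taken to switch on `full_precision_matrix_mult` as the line above lists it. The `exclude-layers` key is spelled as in the plan; the quantizer may expect a different spelling.

```python
import re

# UI label -> quantizer options, copied from the plan.
LAYER_FILTERS = {
    "Anima": {"anima": True},
    "Microsoft Lens": {"lens": True},
    "Flux2": {"flux2": True},
    "Chroma": {"distillation_large": True},
    "Radiance": {"nerf_large": True, "radiance": True},
    "WAN": {"wan": True},
    "LTX-2.x": {"ltxv2": True},
    "Qwen Image": {"qwen": True, "full_precision_matrix_mult": True},
    "Z-Image": {"zimage": True, "zimage_refiner": True},
}


def layer_filter_options(preset=None, exclude_regex=""):
    """Return the options for the chosen preset plus an optional exclude pattern."""
    options = dict(LAYER_FILTERS[preset]) if preset else {}
    pattern = exclude_regex.strip()
    if pattern:
        re.compile(pattern)  # raises re.error early if the user typed an invalid pattern
        options["exclude-layers"] = pattern
    return options
```

Compiling the pattern in the UI handler lets the Space report a bad expression right away instead of failing partway through a long quantization run.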

Always included

comfy_quant=True save_quant_metadata=True low_memory=True simple=True calib_samples=40960
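
A short sketch of how the fixed options could be merged with the UI selections. `ALWAYS_INCLUDED` and `build_options` are hypothetical names, and applying the fixed options last so that a UI choice cannot override them is an assumption about the intended behavior.

```python
# Options the plan lists as always included, with the values it gives.
ALWAYS_INCLUDED = {
    "comfy_quant": True,
    "save_quant_metadata": True,
    "low_memory": True,
    "simple": True,
    "calib_samples": 40960,
}


def build_options(format_options: dict, filter_options: dict) -> dict:
    """Merge the UI selections, then apply the fixed options on top."""
    options = {}
    options.update(format_options)
    options.update(filter_options)
    options.update(ALWAYS_INCLUDED)  # applied last so they always take effect
    return options


print(build_options({"scaling_mode": "tensor"}, {"wan": True}))
```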