# Create space for quantizing models
## Download and Upload
- Needs code for downloading the model file from a Hugging Face repo using huggingface_hub.
- Needs code for uploading the quantized model to a target repo as a pull request using huggingface_hub.
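
The two requirements above could be covered roughly as follows. This is a minimal sketch, not the Space's actual code: `download_model` and `upload_as_pr` are hypothetical helper names, the repo type is assumed to be `model`, and the commit message is made up. `hf_hub_download` and `HfApi.upload_file(..., create_pr=True)` are the existing huggingface_hub calls it relies on.

```python
from pathlib import Path
from typing import Optional


def download_model(repo_id: str, filename: str, token: Optional[str] = None) -> Path:
    """Download one file from a Hub repo and return its local cache path."""
    # Imported lazily so this module can be loaded without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return Path(hf_hub_download(repo_id=repo_id, filename=filename, token=token))


def upload_as_pr(local_path, repo_id: str, path_in_repo: str,
                 token: Optional[str] = None, api=None) -> Optional[str]:
    """Upload a file to the target repo as a pull request and return the PR URL."""
    if api is None:
        from huggingface_hub import HfApi

        api = HfApi(token=token)
    commit = api.upload_file(
        path_or_fileobj=str(local_path),
        path_in_repo=path_in_repo,
        repo_id=repo_id,
        repo_type="model",  # assumption: the target is a model repo
        create_pr=True,     # open a pull request instead of committing to main
        commit_message=f"Add quantized model {path_in_repo}",  # hypothetical message
    )
    return commit.pr_url
```

The optional `api` parameter is only there so the upload path can be exercised with a stand-in object, without network access or a token.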
## Quantization options in UI
- Source repo and filename for input model
- Target repo and filename for output model
### Quant format options in UI
- int8 rowwise (add `-int8mixedrow-simple` to output model name):
  - `int8=True`
  - `scaling_mode="row"`
- mxfp8 (add `-mxfp8mixed-simple` to output model name):
  - `mxfp8=True`
- fp8 (default; add `-fp8mixed-simple` to output model name):
  - `scaling_mode="tensor"`
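
One way to hold these format choices in the UI code is a lookup table keyed by the label shown to the user. The sketch below is illustrative: `QUANT_FORMATS`, `format_options`, and `output_filename` are hypothetical names, and placing the suffix before the file extension is an assumption, since the plan only says the suffix is added to the output model name.

```python
from pathlib import PurePosixPath

DEFAULT_FORMAT = "fp8"

# UI label -> name suffix and the options listed in the plan for that format.
QUANT_FORMATS = {
    "int8 rowwise": {
        "suffix": "-int8mixedrow-simple",
        "options": {"int8": True, "scaling_mode": "row"},
    },
    "mxfp8": {
        "suffix": "-mxfp8mixed-simple",
        "options": {"mxfp8": True},
    },
    "fp8": {
        "suffix": "-fp8mixed-simple",
        "options": {"scaling_mode": "tensor"},
    },
}


def format_options(fmt: str = DEFAULT_FORMAT) -> dict:
    """Return a copy of the options for the chosen quant format."""
    return dict(QUANT_FORMATS[fmt]["options"])


def output_filename(source_filename: str, fmt: str = DEFAULT_FORMAT) -> str:
    """Insert the format suffix before the extension (assumed placement)."""
    path = PurePosixPath(source_filename)
    suffix = QUANT_FORMATS[fmt]["suffix"]
    return str(path.with_name(f"{path.stem}{suffix}{path.suffix}"))
```

With this table, the output filename field can be prefilled from the source filename whenever the user changes the format dropdown.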
## Layer filter options in UI
- Anima:
  - `anima=True`
- Microsoft Lens:
  - `lens=True`
- Flux2:
  - `flux2=True`
- Chroma:
  - `distillation_large=True`
- Radiance:
  - `nerf_large=True`
  - `radiance=True`
- WAN:
  - `wan=True`
- LTX-2.x:
  - `ltxv2=True`
- Qwen Image (should add high precision matmul option):
  - `qwen=True`
  - `full_precision_matrix_mult=True`
- Z-Image:
  - `zimage=True`
  - `zimage_refiner=True`
- Regular expression (string value should be free text input):
  - `exclude-layers="(substring_1|substring_2|substring_3)"`
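
The layer filter list can be mirrored as a preset table plus a validated free-text field. This is a sketch under assumptions: `LAYER_FILTERS` and `filter_options` are hypothetical names, it assumes several presets may be selected at once, it treats `full_precision_matrix_mult` as always paired with the Qwen Image preset, and it keeps the regular expression under the key `exclude-layers` exactly as written in the plan.

```python
import re

# UI label -> flags listed in the plan for that model family.
LAYER_FILTERS = {
    "Anima": {"anima": True},
    "Microsoft Lens": {"lens": True},
    "Flux2": {"flux2": True},
    "Chroma": {"distillation_large": True},
    "Radiance": {"nerf_large": True, "radiance": True},
    "WAN": {"wan": True},
    "LTX-2.x": {"ltxv2": True},
    "Qwen Image": {"qwen": True, "full_precision_matrix_mult": True},
    "Z-Image": {"zimage": True, "zimage_refiner": True},
}


def filter_options(selected, exclude_regex: str = "") -> dict:
    """Collect the flags for the selected presets plus an optional regex."""
    options = {}
    for name in selected:
        options.update(LAYER_FILTERS[name])
    pattern = exclude_regex.strip()
    if pattern:
        re.compile(pattern)  # raises re.error early if the user input is invalid
        options["exclude-layers"] = pattern
    return options
```

Compiling the pattern before passing it on lets the UI report a bad expression immediately instead of failing partway through a quantization run.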
## Always included
- `comfy_quant=True`
- `save_quant_metadata=True`
- `low_memory=True`
- `simple=True`
- `calib_samples=40960`
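
These fixed settings can be merged with whatever the UI produces just before quantization starts. A minimal sketch, assuming the settings are passed around as plain dictionaries; `ALWAYS_INCLUDED` and `build_options` are hypothetical names, and how the merged result reaches the quantizer is left open because the plan does not specify it.

```python
# Settings the plan lists as always included, regardless of UI choices.
ALWAYS_INCLUDED = {
    "comfy_quant": True,
    "save_quant_metadata": True,
    "low_memory": True,
    "simple": True,
    "calib_samples": 40960,
}


def build_options(*ui_option_groups: dict) -> dict:
    """Merge UI option groups, then apply the fixed settings on top."""
    merged = {}
    for group in ui_option_groups:
        merged.update(group)
    # Applied last so a UI value can never override an always-included setting.
    merged.update(ALWAYS_INCLUDED)
    return merged
```

For example, `build_options({"scaling_mode": "tensor"}, {"wan": True})` yields the fp8 and WAN options together with the five fixed settings.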