| --- |
| language: |
| - en |
| license: other |
| license_name: quantfunc-model-license |
| tags: |
| - image-generation |
| - text-to-image |
| - diffusion |
| - quantized |
| - quantfunc |
| --- |
| |
| # QuantFunc |
|
|
| <div align="center" style="margin-top: 50px;"> |
| <img src="assets/logo.webp" width="300" alt="Logo"> |
| </div> |
|
|
| # Qwen-Image-Series |
|
|
| Pre-quantized **Qwen-Image-2512** text-to-image model series by [QuantFunc](https://github.com/user/quantfunc), with Lighting backend inference support. |
|
|
| ## Overview |
|
|
| Qwen-Image-2512 is a text-to-image diffusion model distilled from the Alibaba Qwen team's image generation model. |
|
|
| With the latest QuantFunc ComfyUI plugin, inference achieves a **2x to 11x speedup** over mainstream frameworks. |
|
|
| ## Hardware Requirements |
|
|
| - Supports NVIDIA RTX 30 series and above |
| - The RTX 20 series lacks BF16 support, which causes significant precision loss when quantizing Qwen-series models, so the 20 series currently supports only Z-Image models. |
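
The GPU rules above can be turned into a small lookup, sketched here under some assumptions: the function name is hypothetical, and the mapping relies on the publicly documented CUDA compute capabilities (8.6 for RTX 30, 8.9 for RTX 40, 12.x for RTX 50, 7.5 for RTX 20). Data-center GPUs are not covered by this README, so the sketch returns `None` for them rather than guessing.

```python
from typing import Optional


def variant_for_capability(major: int, minor: int) -> Optional[str]:
    """Map a CUDA compute capability to a variant name used in this repo.

    Returns None for GPUs this README does not cover (including RTX 20,
    which is limited to Z-Image models).
    """
    if (major, minor) in {(8, 6), (8, 9)}:  # RTX 30 (Ampere) / RTX 40 (Ada)
        return "50x-below"
    if major == 12:  # RTX 50 (Blackwell consumer cards)
        return "50x-above"
    return None  # e.g. (7, 5) is RTX 20: no BF16, not supported here


# With PyTorch installed, the capability of the current GPU can be read via
# torch.cuda.get_device_capability(), which returns a (major, minor) tuple.
print(variant_for_capability(8, 9))
```

The returned string matches the `50x-above` / `50x-below` naming used for the base-model directories and transformer files in this repository.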
|
|
| ## Compatibility |
|
|
| - The base models in this repository are compatible with **any version** of Qwen-Image transformer weights |
| - The QuantFunc code plugin and ComfyUI plugin are **100% compatible** with previous versions of Qwen-Image models |
|
|
| ## Directory Structure |
|
|
| ``` |
| Qwen-Image-Series/ |
| ├── qwen-image-series-50x-above-base-model/  # Base model, optimized for RTX 50 series and above |
| │   ├── text_encoder/  # Qwen2.5-VL text encoder (pre-quantized) |
| │   ├── vae/  # 3D VAE decoder (~242MB) |
| │   ├── tokenizer/  # Tokenizer |
| │   ├── scheduler/  # Scheduler config |
| │   ├── model_index.json |
| │   └── quantfunc_config.json |
| ├── qwen-image-series-50x-below-base-model/  # Base model, optimized for RTX 30/40 series |
| │   └── (same structure as above) |
| ├── transformer/ |
| │   ├── config.json |
| │   ├── qwen-image-2512-50x-above-lighting-4steps.safetensors  # RTX 50+ Lighting 4-step (~14GB) |
| │   ├── qwen-image-2512-50x-above-lighting-4steps-prequant.safetensors  # RTX 50+ Lighting pre-quantized (~11GB) |
| │   ├── qwen-image-2512-50x-below-lighting-4steps.safetensors  # RTX 30/40 Lighting 4-step (~14GB) |
| │   └── qwen-image-2512-50x-below-lighting-4steps-prequant.safetensors  # RTX 30/40 Lighting pre-quantized (~11GB) |
| ├── prequant/  # Pre-quantized modulation weights |
| │   ├── qwen-image-2512-50x-above.safetensors  # RTX 50+ mod weights (2512) |
| │   ├── qwen-image-2512-50x-below.safetensors  # RTX 30/40 mod weights (2512) |
| │   ├── qwen-image-50x-above.safetensors  # RTX 50+ mod weights (legacy) |
| │   └── qwen-image-50x-below.safetensors  # RTX 30/40 mod weights (legacy) |
| └── precision-config/  # Lighting precision config samples |
|     ├── 50x-above-fp4-sample.json  # FP4 config for RTX 50+ |
|     └── 50x-below-int4-sample.json  # INT4 config for RTX 30/40 |
| ``` |
|
|
| ## Model Variants |
|
|
| | Variant | base-model | transformer | Total Size | Target GPU | |
| |---------|-----------|-------------|------------|------------| |
| | **50x-above** | `qwen-image-series-50x-above-base-model` | `qwen-image-2512-50x-above-lighting-4steps.safetensors` | ~14GB | RTX 50 series and above | |
| | **50x-below** | `qwen-image-series-50x-below-base-model` | `qwen-image-2512-50x-below-lighting-4steps.safetensors` | ~14GB | RTX 30/40 series | |
|
|
| - **50x-above**: Optimized for RTX 50 series (Blackwell) and above |
| - **50x-below**: Optimized for RTX 30/40 series |
| - **4steps**: Distilled accelerated version (only 4 steps needed to generate images) |
|
|
| > The base-model and transformer must use the **same variant** (both above or both below). |
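
A mismatch between the two is easy to catch before launching inference. The following is a minimal sketch, assuming only the file and directory naming shown in this README; the helper names `variant_of` and `check_same_variant` are hypothetical and not part of QuantFunc.

```python
import re
from pathlib import PurePosixPath

_VARIANT = re.compile(r"50x-(above|below)")


def variant_of(path: str) -> str:
    """Extract '50x-above' or '50x-below' from the last path component."""
    name = PurePosixPath(path).name
    match = _VARIANT.search(name)
    if match is None:
        raise ValueError(f"no 50x-above/50x-below marker in {name!r}")
    return match.group(0)


def check_same_variant(model_dir: str, transformer: str) -> str:
    """Return the shared variant, or raise if base-model and transformer differ."""
    base, trans = variant_of(model_dir), variant_of(transformer)
    if base != trans:
        raise ValueError(
            f"variant mismatch: base-model is {base}, transformer is {trans}"
        )
    return base


print(check_same_variant(
    "Qwen-Image-Series/qwen-image-series-50x-below-base-model",
    "Qwen-Image-Series/transformer/qwen-image-2512-50x-below-lighting-4steps.safetensors",
))
```

The check only inspects names, so it says nothing about whether the files themselves are intact.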
|
|
| ## Quick Start |
|
|
| ### Download |
|
|
| ```bash |
| pip install modelscope |
| ``` |
|
|
| ```python |
| from modelscope import snapshot_download |
| model_dir = snapshot_download('QuantFunc/Qwen-Image-Series') |
| ``` |
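
Since each GPU needs only one variant, it may be worth downloading just those files into a local `Qwen-Image-Series/` folder so the paths in the commands below resolve as written. This is a sketch under assumptions: the `local_dir` and `allow_file_pattern` keyword arguments are taken from recent ModelScope releases and should be checked against your installed version, glob-style matching against repo-relative paths is assumed, and `variant_patterns` / `download_variant` are hypothetical helper names.

```python
from typing import List


def variant_patterns(variant: str) -> List[str]:
    """Glob patterns for the files one variant needs (layout from this README)."""
    if variant not in ("50x-above", "50x-below"):
        raise ValueError(f"unknown variant: {variant!r}")
    return [
        f"qwen-image-series-{variant}-base-model/*",
        "transformer/config.json",
        f"transformer/qwen-image-2512-{variant}-*",
        f"prequant/*-{variant}.safetensors",
        "precision-config/*",
    ]


def download_variant(variant: str, local_dir: str = "Qwen-Image-Series") -> str:
    """Download one variant; keyword names are assumptions, see the note above."""
    from modelscope import snapshot_download  # imported lazily on purpose

    return snapshot_download(
        "QuantFunc/Qwen-Image-Series",
        local_dir=local_dir,
        allow_file_pattern=variant_patterns(variant),
    )


print(variant_patterns("50x-above"))
```

Calling `download_variant("50x-above")` would then fetch the RTX 50 files only; if your ModelScope version rejects either keyword, fall back to the plain `snapshot_download` call shown above.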
|
|
| ### Inference |
|
|
| ```bash |
| # RTX 50 series |
| quantfunc \ |
| --model-dir Qwen-Image-Series/qwen-image-series-50x-above-base-model \ |
| --transformer Qwen-Image-Series/transformer/qwen-image-2512-50x-above-lighting-4steps.safetensors \ |
| --auto-optimize --model-backend lighting \ |
| --prompt "a beautiful sunset over the ocean with dramatic clouds" \ |
| --output output.png --steps 4 |
| |
| # RTX 30/40 series |
| quantfunc \ |
| --model-dir Qwen-Image-Series/qwen-image-series-50x-below-base-model \ |
| --transformer Qwen-Image-Series/transformer/qwen-image-2512-50x-below-lighting-4steps.safetensors \ |
| --auto-optimize --model-backend lighting \ |
| --prompt "a beautiful sunset over the ocean with dramatic clouds" \ |
| --output output.png --steps 4 |
| ``` |
|
|
| `--auto-optimize` automatically configures VRAM management, attention backend, and offload strategies based on your GPU. |
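
The two commands above differ only in the variant name, so a script can assemble them from one template. The sketch below uses only the flags shown in this README; `build_command` is a hypothetical helper, and actually running the result requires the `quantfunc` CLI on your `PATH`.

```python
import shlex
from typing import List


def build_command(variant: str, prompt: str, output: str = "output.png",
                  root: str = "Qwen-Image-Series") -> List[str]:
    """Assemble the quantfunc argument list for a given variant."""
    if variant not in ("50x-above", "50x-below"):
        raise ValueError(f"unknown variant: {variant!r}")
    return [
        "quantfunc",
        "--model-dir", f"{root}/qwen-image-series-{variant}-base-model",
        "--transformer",
        f"{root}/transformer/qwen-image-2512-{variant}-lighting-4steps.safetensors",
        "--auto-optimize", "--model-backend", "lighting",
        "--prompt", prompt,
        "--output", output, "--steps", "4",
    ]


cmd = build_command("50x-below",
                    "a beautiful sunset over the ocean with dramatic clouds")
print(shlex.join(cmd))  # shell-quoted form, safe to copy into a terminal
# To execute it from Python: subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell-quoting problems with prompts that contain quotes or special characters.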
|
|
| ## SVDQ and Lighting Backends |
|
|
| This repository provides **Lighting** backend models. Differences between the two backends: |
|
|
| | Feature | Lighting | SVDQ | |
| |---------|----------|------| |
| | **Quantization** | Per-layer mixed precision (FP4/INT4/FP8/INT8) | Nunchaku-based holistic pre-quantization | |
| **LoRA Integration** | Real-time quantization: build a custom model in 5 minutes with zero speed loss, integrating any number of LoRAs | Runtime low-rank pathway | |
| | **Ecosystem** | QuantFunc native | Compatible with the widely-adopted Nunchaku ecosystem, enhanced with Rotation quantization and Auto Rank dynamic rank optimization | |
| | **Flexibility** | Per-layer/sub-layer precision control | Precision fixed at export time | |
| | **Use Cases** | Rapid personal model customization, batch LoRA integration | Leverage Nunchaku ecosystem, runtime dynamic LoRA | |
|
|
| ## Pre-quantized Modulation Weights (prequant/) |
|
|
| The `prequant/` directory contains **pre-quantized modulation weights** extracted from SVDQ models. Use them with the Lighting backend for high-quality modulation without runtime quantization overhead. |
|
|
| ```bash |
| # From FP16 with mod weights (first run quantizes on-the-fly) |
| quantfunc \ |
| --model-dir Qwen-Image-Series/qwen-image-series-50x-above-base-model \ |
| --model-backend lighting \ |
| --precision-config Qwen-Image-Series/precision-config/50x-above-fp4-sample.json \ |
| --mod-weights Qwen-Image-Series/prequant/qwen-image-2512-50x-above.safetensors \ |
| --rotation-block-size 256 \ |
| --prompt "a beautiful sunset" --steps 4 --auto-optimize |
| ``` |
|
|
| Alternatively, use the **pre-quantized Lighting transformer** for instant loading (no runtime quantization): |
|
|
| ```bash |
| quantfunc \ |
| --model-dir Qwen-Image-Series/qwen-image-series-50x-above-base-model \ |
| --transformer Qwen-Image-Series/transformer/qwen-image-2512-50x-above-lighting-4steps-prequant.safetensors \ |
| --model-backend lighting \ |
| --prompt "a beautiful sunset" --steps 4 --auto-optimize |
| ``` |
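
To see which tensors a file under `prequant/` or `transformer/` contains without loading any weights, you can read its header directly. This sketch relies on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header); the helper names are hypothetical, and the tensor names you will see are not documented in this README.

```python
import json
import struct
from typing import Any, Dict


def read_safetensors_header(path: str) -> Dict[str, Any]:
    """Return {tensor_name: {"dtype", "shape", "data_offsets"}} for a file."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian uint64
        header = json.loads(f.read(header_len).decode("utf-8"))
    header.pop("__metadata__", None)  # optional free-form metadata entry
    return header


def summarize(path: str) -> None:
    """Print one line per tensor: name, dtype, and shape."""
    for name, info in sorted(read_safetensors_header(path).items()):
        print(f"{name}: dtype={info['dtype']} shape={info['shape']}")
```

For example, `summarize("Qwen-Image-Series/prequant/qwen-image-2512-50x-above.safetensors")` lists the modulation tensors in that file, which can help confirm you picked the file matching your variant.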
|
|
| ## Precision Config (precision-config/) |
|
|
| Sample per-layer precision configurations for the Lighting backend: |
|
|
| | File | Target GPU | Precision | |
| |------|-----------|-----------| |
| | `50x-above-fp4-sample.json` | RTX 50+ | FP4 attention + AF8WF4 MLP fc2 + INT8 modulation | |
| | `50x-below-int4-sample.json` | RTX 30/40 | INT4 all layers + INT8 modulation | |
|
|
| ## Related Repositories |
|
|
| - [QuantFunc/Z-Image-Series](https://modelscope.cn/models/QuantFunc/Z-Image-Series): Z-Image-Turbo text-to-image (lightweight, fast) |
| - [QuantFunc/Qwen-Image-Edit-Series](https://modelscope.cn/models/QuantFunc/Qwen-Image-Edit-Series): Qwen-Image-Edit image editing |
|
|
| ## License |
|
|
| The pre-quantized model weights in this repository are derived from the original models. Users must comply with the original model's license agreement. The QuantFunc inference engine and its plugins (including the ComfyUI plugin) are licensed separately; see the official QuantFunc channels for details. |
|
|
| For models quantized from commercially licensed models, users are responsible for obtaining the necessary commercial licenses from the original model providers. |
|
|
|
|