Project Deprecation and Model Removal Notice
Deprecation of Nunchaku-R128-SDXL-Series
After extensive technical evaluation and practical benchmarking, I have decided to deprecate and remove the Nunchaku-R128-SDXL-Series from Hugging Face within the next 30 days.
This project was originally intended to provide Rank-128, 4-bit quantized SDXL models using the Nunchaku SVDQ engine, with the goal of achieving:
- Improved generation speed
- Reduced VRAM consumption
- Minimal loss in image quality
However, further investigation revealed that for SDXL, which is already relatively lightweight, these goals could not be achieved in practice.
Specifically:
- The theoretical performance benefits of 4-bit quantization are offset by fp16 conversion costs and surrounding runtime overhead.
- As a result, no consistent real-world generation speed advantage over standard fp16 workflows was observed.
- Additionally, no meaningful reduction in VRAM usage could be confirmed compared to fp16 execution.
While model size reduction itself was successfully achieved, for SDXL this alone does not justify the added complexity. In terms of both generation speed and VRAM usage, the 4-bit SVDQ approach does not provide practical benefits for SDXL.
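The overhead described above can be illustrated with a deliberately simplified sketch (pure Python, toy matrix-vector product; the function names and shapes are illustrative, not the engine's actual code): a 4-bit path must expand the quantized weights back to floating point before every matmul, a step the fp16 path never performs.

```python
# Toy illustration (not the engine's real code) of the extra per-layer work
# a 4-bit path pays: weights must be dequantized back to floating point
# before (or fused into) every matmul.

def forward_fp16(x, w):
    # fp16-style path: weights are already floats, just multiply.
    return [sum(a * b for a, b in zip(x, row)) for row in w]

def forward_w4(x, w_int4, scale):
    # 4-bit path: dequantize INT4 -> float first, then the same matmul.
    # This conversion is the overhead that fp16 execution never incurs.
    w = [[v * scale for v in row] for row in w_int4]
    return [sum(a * b for a, b in zip(x, row)) for row in w]

# Same result either way; the 4-bit path simply does more work per layer.
x = [1.0, 2.0]
print(forward_fp16(x, [[1.0, 0.0], [0.0, 1.0]]))  # [1.0, 2.0]
print(forward_w4(x, [[2, 0], [0, 2]], 0.5))       # [1.0, 2.0]
```

In a real engine the dequantization is fused into custom kernels, but the conversion cost does not disappear; for a comparatively small model like SDXL it can cancel the theoretical 4-bit speedup.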
Change in Direction: fp8e4m3
During the same evaluation period, fp8e4m3-based compression was tested and showed significantly better practical characteristics for SDXL:
- Full compatibility with standard loaders
- Image quality effectively equivalent to fp16
- No generation speed degradation
- No increase in VRAM usage
- Model size comparable to Nunchaku 4-bit variants
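For intuition about the quality point above, here is a minimal, self-contained sketch of round-to-nearest in the E4M3 format (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, as defined in the OCP FP8 specification). This illustrates the number format itself, not the actual conversion script.

```python
import math

def fp8_e4m3_round(x: float) -> float:
    """Round x to the nearest value representable in FP8 E4M3
    (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    if mag > 448.0:           # clamp to the largest finite E4M3 value
        return sign * 448.0
    # Binade exponent, clamped at -6 so tiny values fall into subnormals.
    exp = max(math.floor(math.log2(mag)), -6)
    step = 2.0 ** (exp - 3)   # 3 mantissa bits -> 8 steps per binade
    return sign * round(mag / step) * step

print(fp8_e4m3_round(1.0))    # 1.0 (exactly representable)
print(fp8_e4m3_round(0.1))    # 0.1015625 (~1.6% relative error)
print(fp8_e4m3_round(500.0))  # 448.0 (clamped)
```

With 3 mantissa bits the worst-case relative rounding error for normal values stays around 2^-4 (about 6%), which in practice is small enough that SDXL outputs are visually indistinguishable from fp16.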
Based on these results, I have concluded that continuing a 4-bit quantization (SVDQ) approach for SDXL is not technically justified, and that fp8e4m3 represents a more rational and maintainable solution.
Accordingly, this repository will be deprecated and removed.
Future Plans
While Nunchaku-R128-SDXL-Series itself will be removed, fp8e4m3-based SDXL models, compression scripts, and related technical documentation will continue to be released going forward.
The knowledge gained through this project will be applied in more practical and sustainable forms.
Affected Models
- All SDXL models published under ussoewwin / Nunchaku-R128-SDXL-Series
These models will be removed within one month.
Notes for Users
- The repository will remain accessible during the deprecation period for reference and evaluation.
- After removal, tags and revision history will be preserved for archival and research purposes.
- Users seeking lightweight SDXL workflows are encouraged to use fp8e4m3-based formats or standard fp16 pipelines.
Thank you for your interest and support.
- ussoewwin
...
Nunchaku R128 SDXL Series: High-Fidelity 4-bit Quantization
This repository provides a collection of high-fidelity quantized SDXL models optimized using the Nunchaku (SVDQ W4A4) engine.
Each model in this series is quantized with Rank 128 (r128). While standard quantization often uses r32 or r64, r128 is used here to ensure maximum quality preservation. This is particularly crucial for:
- Photorealistic Models: Maintaining skin textures, pores, and complex lighting.
- Illustrious/Anime Models: Preserving the high-dimensional semantic understanding and delicate line-work of the latest base models.
- ControlNet Compatibility: Ensuring that the feature maps and structural details remain intact for advanced workflows.
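The effect of the rank choice can be made concrete with a toy numerical sketch (NumPy, random matrix as a stand-in for a real SDXL weight; this shows the general SVDQuant idea, not the engine's implementation): the higher the rank of the high-precision branch, the smaller the residual left for the 4-bit quantizer to absorb.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512))  # toy stand-in for an SDXL weight matrix

def lowrank_residual_norm(W, r):
    # SVDQuant-style split: keep a rank-r branch in high precision and
    # quantize only the residual W - U_r S_r V_r^T.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    W_r = (U[:, :r] * S[:r]) @ Vt[:r]
    return float(np.linalg.norm(W - W_r))

# A higher rank leaves a smaller residual for the 4-bit quantizer to absorb.
errs = {r: lowrank_residual_norm(W, r) for r in (32, 64, 128)}
assert errs[32] > errs[64] > errs[128]
```

The trade-off is that the rank-128 branch adds more high-precision parameters and compute than r32/r64, which is why it is reserved here for quality-critical use cases.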
Key Features
- Engine: Nunchaku SVDQuant (SVD-based quantization with a low-rank high-precision branch that absorbs outliers)
- Precision: FP4 (INT4 / NVFP4, 4-bit Weights / 4-bit Activations)
- Rank: 128 (r128) - Significantly superior detail reconstruction compared to lower ranks.
- VRAM Optimized: Fits comfortably in 8GB-12GB VRAM without sacrificing SDXL's inherent quality.
- Performance: Native acceleration on NVIDIA RTX 30/40/50 series GPUs.
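As a rough picture of the 4-bit weight side (the W4 half of W4A4), here is a symmetric per-tensor INT4 quantizer sketch in plain Python; real engines use per-group scales and fused kernels, and these function names are illustrative only.

```python
def quantize_int4(weights):
    """Symmetric per-tensor INT4 quantization (the W4 half of W4A4):
    map floats to integers in [-8, 7] with one shared scale."""
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int4(q, scale):
    # Reconstruct approximate floats; each value lands within half a step.
    return [v * scale for v in q]

w = [0.12, -0.5, 0.33, 0.7, -0.04]
q, s = quantize_int4(w)
print(q)                      # [1, -5, 3, 7, 0]
print(dequantize_int4(q, s))  # each value within one half-step of the original
```

With only 16 representable levels, plain INT4 loses outlier weights badly; that residual error is exactly what the rank-128 SVD branch above is there to absorb.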
Usage (ComfyUI)
To use these models with full features (Dual CLIP loading, LoRA support, and ControlNet compatibility), you need the Unofficial Nunchaku Loader nodes.
1. Required Custom Nodes
- Nunchaku DiT & LoRA Loader (by ussoewwin): ComfyUI-nunchaku-unofficial-loader
- Stable Diffusion WebUI Nunchaku (by ussoewwin): Stable-Diffusion-WebUI-Nunchaku
Note: This loader is specifically designed to handle SVDQ-patched UNet/DiT models and provides seamless LoRA integration.
2. Setup
- VAE: Use the standard SDXL VAE (place it in models/vae/)
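The setup steps above amount to the following folder layout (directory names follow common ComfyUI defaults; whether the SVDQ checkpoints belong under diffusion_models is an assumption here, so check the loader's README for the exact location):

```shell
# Assumed ComfyUI layout; adjust paths to your own install.
mkdir -p ComfyUI/custom_nodes             # ComfyUI-nunchaku-unofficial-loader goes here
mkdir -p ComfyUI/models/diffusion_models  # *_r128_svdq_fp4.safetensors files (assumed location)
mkdir -p ComfyUI/models/vae               # standard SDXL VAE
```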
Available Models
| Filename | Base Model | Version | License |
|---|---|---|---|
| realvisxlV50_v50_r128_svdq_fp4.safetensors | RealVisXL V5.0 | v5.0 | CreativeML Open RAIL++-M |
| waiRealCN_v10_r128_svdq_fp4.safetensors | wai-RealCN | v10.0 | CreativeML Open RAIL++-M |
| bluepencilXL_v031_r128_svdq_fp4.safetensors | BluePencil-XL | v0.3.1 | CreativeML Open RAIL++-M |
| waiIllustriousSDXL_v160_r128_svdq_fp4.safetensors | waiIllustriousSDXL | v1.6.0 | CreativeML Open RAIL++-M |
| koronemixIllustrious_v70_r128_svdq_fp4.safetensors | koronemix-illustrious | v70.0 | CreativeML Open RAIL++-M |
| novaanimeXL_v15_r128_svdq_fp4.safetensors | Nova Anime XL | v15.0 | CreativeML Open RAIL++-M |
Credits & License
Special Acknowledgement
We extend our deepest respect and gratitude to the Nunchaku Team for their groundbreaking work on SVDQ quantization and for sharing their models with the community. This collection relies heavily on their research and original implementation.
- Original Repository: nunchaku-tech/nunchaku-sdxl
Base Models
These models are derivatives of their respective creators. All credit for aesthetic tuning and model training belongs to the original creators.
- RealVisXL V5.0: Created by SG_161222.
- wai-RealCN: Created by wai.
- BluePencil-XL v0.3.1: Created by blue_pen.
- waiIllustriousSDXL: Created by wai.
- koronemix-illustrious: Created by korone.
- Nova Anime XL: Created by realdos.
Software & Integration
- ComfyUI Loaders: The Nunchaku SDXL DiT Loader and LoRA Loader were developed and are maintained by ussoewwin (GitHub).
- Quantization Engine: Models quantized using the Nunchaku framework by MIT HAN Lab.
Disclaimer: These models are provided for optimization and research purposes. Please adhere to the original licenses of the base models.