πŸ“Œ Project Deprecation and Model Removal Notice

Deprecation of Nunchaku-R128-SDXL-Series

After extensive technical evaluation and practical benchmarking, I have decided to deprecate and remove the Nunchaku-R128-SDXL-Series from Hugging Face within the next 30 days.

This project was originally intended to provide Rank-128, 4-bit quantized SDXL models using the Nunchaku SVDQ engine, with the goal of achieving:

  • Improved generation speed
  • Reduced VRAM consumption
  • Minimal loss in image quality

However, further investigation revealed that for SDXL, which is already relatively lightweight, these goals could not be achieved in practice.

Specifically:

  • The theoretical performance benefits of 4-bit quantization are offset by fp16 conversion costs and surrounding runtime overhead.
  • As a result, no consistent real-world generation speed advantage over standard fp16 workflows was observed.
  • Additionally, no meaningful reduction in VRAM usage could be confirmed compared to fp16 execution.

While model size reduction itself was successfully achieved, for SDXL this alone does not justify the added complexity. In terms of both generation speed and VRAM usage, the 4-bit SVDQ approach does not provide practical benefits for SDXL.


Change in Direction: fp8e4m3

During the same evaluation period, fp8e4m3-based compression was tested and showed significantly better practical characteristics for SDXL:

  • Full compatibility with standard loaders
  • Image quality effectively equivalent to fp16
  • No generation speed degradation
  • No increase in VRAM usage
  • Model size comparable to Nunchaku 4-bit variants

Based on these results, I have concluded that continuing a 4-bit quantization (SVDQ) approach for SDXL is not technically justified, and that fp8e4m3 represents a more rational and maintainable solution.
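For reference, fp8e4m3 is simply a coarser floating-point grid: 1 sign bit, 4 exponent bits (bias 7), and 3 mantissa bits, saturating at ±448 in the FN variant used by ML frameworks. A minimal pure-Python sketch of rounding to that grid follows (bit packing omitted; in practice you would cast tensors with a framework dtype such as torch.float8_e4m3fn):

```python
import math

def fp8_e4m3_round(x: float) -> float:
    """Round a float to the nearest value representable in fp8 e4m3
    (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits).
    Sketch of the numeric grid only; no bit packing is performed."""
    if x == 0.0 or math.isnan(x):
        return x
    sign = math.copysign(1.0, x)
    a = abs(x)
    if a > 448.0:                # the FN variant saturates at +/-448
        return sign * 448.0
    e = math.floor(math.log2(a))
    e = max(e, -6)               # subnormal range: exponent floors at 2**-6
    step = 2.0 ** (e - 3)        # 3 mantissa bits -> 8 steps per binade
    return sign * round(a / step) * step

# Relative rounding error is bounded by half a mantissa step (~6%),
# which is why fp8e4m3 tracks fp16 closely for weight storage.
print(fp8_e4m3_round(0.3))      # snaps to the nearest e4m3 value, 0.3125
```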

Accordingly, this repository will be deprecated and removed.


Future Plans

While Nunchaku-R128-SDXL-Series itself will be removed, fp8e4m3-based SDXL models, compression scripts, and related technical documentation will continue to be released going forward.

The knowledge gained through this project will be applied in more practical and sustainable forms.


Affected Models

  • All SDXL models published under ussoewwin / Nunchaku-R128-SDXL-Series

These models will be removed within one month.


Notes for Users

  • The repository will remain accessible during the deprecation period for reference and evaluation.
  • After removal, tags and revision history will be preserved for archival and research purposes.
  • Users seeking lightweight SDXL workflows are encouraged to use fp8e4m3-based formats or standard fp16 pipelines.

Thank you for your interest and support.

β€” ussoewwin


---

Nunchaku R128 SDXL Series: High-Fidelity 4-bit Quantization

This repository provides a collection of high-fidelity quantized SDXL models optimized using the Nunchaku (SVDQ W4A4) engine.

Each model in this series is quantized with Rank 128 (r128). While standard quantization often uses r32 or r64, r128 is used here to ensure maximum quality preservation. This is particularly crucial for:

  • Photorealistic Models: Maintaining skin textures, pores, and complex lighting.
  • Illustrious/Anime Models: Preserving the high-dimensional semantic understanding and delicate line-work of the latest base models.
  • ControlNet Compatibility: Ensuring that the feature maps and structural details remain intact for advanced workflows.

πŸš€ Key Features

  • Engine: Nunchaku SVDQuant (SVD-based low-rank decomposition + 4-bit quantization)
  • Precision: FP4 (NVFP4) / INT4 (4-bit Weights / 4-bit Activations, W4A4)
  • Rank: 128 (r128) - Significantly superior detail reconstruction compared to lower ranks.
  • VRAM Optimized: Fits comfortably in 8GB-12GB VRAM without sacrificing SDXL's inherent quality.
  • Performance: Native acceleration on NVIDIA RTX 30/40/50 series GPUs.

πŸ›  Usage (ComfyUI)

To use these models with full features (Dual CLIP loading, LoRA support, and ControlNet compatibility), you need the Unofficial Nunchaku Loader nodes.

1. Required Custom Nodes

  • Unofficial Nunchaku Loader nodes: the Nunchaku SDXL DiT Loader and LoRA Loader, maintained by ussoewwin on GitHub

2. Setup

  • VAE: Use a standard SDXL VAE (place it in models/vae/)

πŸ“¦ Available Models

| Filename | Base Model | Version | License |
|---|---|---|---|
| realvisxlV50_v50_r128_svdq_fp4.safetensors | RealVisXL V5.0 | v5.0 | CreativeML Open RAIL++-M |
| waiRealCN_v10_r128_svdq_fp4.safetensors | wai-RealCN | v1.0 | CreativeML Open RAIL++-M |
| bluepencilXL_v031_r128_svdq_fp4.safetensors | BluePencil-XL | v0.3.1 | CreativeML Open RAIL++-M |
| waiIllustriousSDXL_v160_r128_svdq_fp4.safetensors | waiIllustriousSDXL | v1.6.0 | CreativeML Open RAIL++-M |
| koronemixIllustrious_v70_r128_svdq_fp4.safetensors | koronemix-illustrious | v7.0 | CreativeML Open RAIL++-M |
| novaanimeXL_v15_r128_svdq_fp4.safetensors | Nova Anime XL | v1.5 | CreativeML Open RAIL++-M |

πŸ“œ Credits & License

πŸ† Special Acknowledgement

We extend our deepest respect and gratitude to the Nunchaku Team for their groundbreaking work on SVDQ quantization and for sharing their models with the community. This collection relies heavily on their research and original implementation.

Base Models

These models are derivatives of works by their respective creators. All credit for aesthetic tuning and model training belongs to the original authors.

  • RealVisXL V5.0: Created by SG_161222.
  • wai-RealCN: Created by wai.
  • BluePencil-XL v0.3.1: Created by blue_pen.
  • waiIllustriousSDXL: Created by wai.
  • koronemix-illustrious: Created by korone.
  • Nova Anime XL: Created by realdos.

Software & Integration

  • ComfyUI Loaders: The Nunchaku SDXL DiT Loader and LoRA Loader were developed and are maintained by ussoewwin (GitHub).
  • Quantization Engine: Models quantized using the Nunchaku framework by MIT HAN Lab.

Disclaimer: These models are provided for optimization and research purposes. Please adhere to the original licenses of the base models.
