Update README.md
README.md
CHANGED
@@ -103,10 +103,10 @@ convert_to_quant -i $1 \
 
 | File | Description |
 |---|---|
-| `
-| `
+| `z_image_turbo_nvfp4.safetensors` | Quantized weights |
+| `z_image_turbo_nvfp4_lora.safetensors` | Error-correction LoRA (rank 32) |
 
-Use the LoRA at **1.5–2.0** strength in ComfyUI for maximum fidelity.
+Use the LoRA with variable strength in ComfyUI for improved fidelity.
 
 ## Requirements
 
@@ -115,18 +115,18 @@ Use the LoRA at **1.5–2.0** strength in ComfyUI for maximum fidelity.
 
 ## Comparison
 
-| | NVFP4 Mixed (this) |
-|---|---|---|---|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
+| | NVFP4 Mixed (this) | MXFP8 Uniform | Official NVFP4 |
+| --- | --- | --- | --- |
+| Size | 4.84 GB | 6.23 GB | 4.51 GB |
+| Base format | NVFP4 (4-bit) | MXFP8 (8-bit) | NVFP4 (4-bit) |
+| Custom layers | ~100 tensors → MXFP8 | None | None |
+| BF16 exclusions | ~20 tensors | 8 patterns | Refiners fully BF16 |
+| Learned rounding | ✅ 6000 iter | ❌ --simple | ❌ |
+| LoRA | ✅ rank 32 | ❌ | ❌ |
+| Refiner block 0 | MXFP8 | MXFP8 | BF16 |
+| Late adaLN (22–29) | BF16 | BF16 | NVFP4 ⚠️ |
+| Last QKV (layer 29) | BF16 | BF16 | NVFP4 ⚠️ |
+| Quantization time¹ | ~60–90 min | ~5–10 min | N/A |
 
 ¹ Estimated on RTX 5060 (Blackwell) with `comfy-kitchen` CUDA kernels.
 
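The README line about using the error-correction LoRA "with variable strength" can be illustrated with a minimal sketch. It assumes the generic LoRA formulation, in which a low-rank product is scaled by a strength factor and added to the base weight. The names `matmul`, `apply_lora`, `lora_up`, `lora_down`, and `strength` are hypothetical and are not ComfyUI's API; a real loader may also apply an alpha/rank scale, which is omitted here.

```python
# Hypothetical illustration of LoRA strength; not ComfyUI's actual loader code.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def apply_lora(weight, lora_up, lora_down, strength=1.0):
    """Return weight + strength * (lora_up @ lora_down)."""
    delta = matmul(lora_up, lora_down)
    return [[w + strength * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]

# Toy 2x2 "quantized" layer with a rank-1 correction.
# The LoRA described in the README is rank 32; rank 1 keeps the example small.
quantized = [[1.0, 0.0], [0.0, 1.0]]
up = [[0.5], [0.25]]    # shape (2, 1)
down = [[0.5, -0.5]]    # shape (1, 2)

for s in (0.0, 1.0, 2.0):
    # strength 0.0 leaves the quantized weights untouched;
    # larger values apply more of the correction.
    print(s, apply_lora(quantized, up, down, strength=s))
```

Under this assumption, strength is a linear dial on the correction term, so sweeping a few values and comparing outputs is a reasonable way to pick one.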