---
license: other
license_name: ideogram-4-non-commercial
base_model: ideogram-ai/ideogram-4-fp8
pipeline_tag: text-to-image
tags:
- ideogram
- text-to-image
- sdnq
- uint4
- diffusion
- typography
---
# Ideogram 4 FP8 -> SDNQ UInt4
This is an experimental SDNQ UInt4 conversion of `ideogram-ai/ideogram-4-fp8`, intended for local research and non-commercial use under the upstream Ideogram 4 license. The conversion starts from the FP8 checkpoint: FP8 linear layers are materialized back to bf16, and static SDNQ `uint4` quantization is then applied component by component.
The model includes SDNQ-compressed `text_encoder`, `transformer`, `unconditional_transformer`, and `vae` components. The official `ideogram4` loader cannot instantiate SDNQ-packed custom transformers, so this repository ships its own loader helper, `ideogram4_sdnq_pipeline.py`.
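For a rough sense of what unsigned 4-bit quantization does to a group of weights, here is a minimal pure-Python sketch of asymmetric uint4 quantization with nibble packing. It is a generic illustration, not SDNQ's actual algorithm: the function names, the min/max scaling, and the packing order are all assumptions made for this example.

```python
def quantize_uint4(values):
    """Map a group of floats to integer codes in [0, 15] plus a scale and zero point."""
    lo, hi = min(values), max(values)
    # A constant group would give a zero scale, so fall back to 1.0 in that case.
    scale = (hi - lo) / 15 or 1.0
    codes = [min(15, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo


def dequantize_uint4(codes, scale, zero):
    """Reconstruct approximate floats from uint4 codes."""
    return [zero + c * scale for c in codes]


def pack_uint4(codes):
    """Pack pairs of 4-bit codes into bytes, low nibble first (hypothetical layout)."""
    if len(codes) % 2:
        codes = codes + [0]  # pad to an even count
    return bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))


weights = [-0.75, -0.1, 0.0, 0.2, 0.4, 0.75]
codes, scale, zero = quantize_uint4(weights)
restored = dequantize_uint4(codes, scale, zero)
packed = pack_uint4(codes)  # two weights per byte
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

With this scheme the reconstruction error per weight is bounded by half the group's scale, and storage drops to half a byte per weight plus the per-group scale and zero point.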
## Usage
```python
import torch
from ideogram4 import PRESETS
from ideogram4_sdnq_pipeline import Ideogram4SDNQPipeline

pipe = Ideogram4SDNQPipeline.from_pretrained(
    "WaveCut/ideogram-4-sdnq-uint4",
    device="cuda",
    dtype=torch.bfloat16,
)

preset = PRESETS["V4_DEFAULT_20"]
image = pipe(
    "a typographic poster reading HELLO WORLD",
    height=1024,
    width=1024,
    num_steps=preset.num_steps,
    guidance_schedule=preset.guidance_schedule,
    mu=preset.mu,
    std=preset.std,
    seed=4101,
    raise_on_caption_issues=False,
)[0]
image.save("out.png")
```
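To compare several seeds for one prompt, the call above can be wrapped in a small helper. This is a sketch that assumes the pipeline accepts exactly the keyword arguments shown in the usage snippet and returns an indexable sequence of images; `generate_variants` is a hypothetical name and not part of this repository.

```python
def generate_variants(pipe, prompt, preset, seeds, height=1024, width=1024):
    """Call the pipeline once per seed and return (seed, image) pairs."""
    results = []
    for seed in seeds:
        images = pipe(
            prompt,
            height=height,
            width=width,
            num_steps=preset.num_steps,
            guidance_schedule=preset.guidance_schedule,
            mu=preset.mu,
            std=preset.std,
            seed=seed,
            raise_on_caption_issues=False,
        )
        # The usage snippet indexes the result with [0], so do the same here.
        results.append((seed, images[0]))
    return results


# Example, reusing `pipe` and `preset` from the usage snippet:
# for seed, image in generate_variants(pipe, "a typographic poster", preset, [4101, 4102]):
#     image.save(f"out_{seed}.png")
```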
Install requirements:
```bash
pip install git+https://github.com/ideogram-oss/ideogram4 sdnq safetensors transformers accelerate pillow
```
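Before loading the model, it can help to confirm that the packages are importable. The sketch below uses only the standard library; the import names are assumed to match the package names, with `PIL` for `pillow` and `torch` added because the usage snippet imports it. `missing_modules` is a hypothetical helper.

```python
from importlib.util import find_spec

# Import names assumed from the pip command and the usage snippet.
REQUIRED_MODULES = [
    "torch",
    "ideogram4",
    "sdnq",
    "safetensors",
    "transformers",
    "accelerate",
    "PIL",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level modules that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```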
## Component Structure
Upstream FP8 structure:
- `text_encoder`: Qwen3-VL text path used in text-only mode. Hidden states from 13 layers are concatenated for the DiT.
- `transformer`: conditional 34-layer single-stream DiT.
- `unconditional_transformer`: image-only negative branch used for asymmetric CFG.
- `vae`: Flux2-style KL autoencoder decoder.
- `tokenizer` and `scheduler`: copied from upstream.
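The role of the two transformers can be illustrated with the standard classifier-free guidance combination, where the conditional and unconditional predictions come from two separate networks. This is a sketch under that assumption: this README does not state the exact formula or how `guidance_schedule` varies the scale per step, and `cfg_combine` is a hypothetical name.

```python
def cfg_combine(cond, uncond, guidance_scale):
    """Standard CFG: move from the unconditional prediction toward the conditional one."""
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


# Toy predictions standing in for the outputs of `transformer` and
# `unconditional_transformer` at one denoising step.
cond_pred = [1.0, 2.0, 3.0]
uncond_pred = [0.5, 2.0, 2.0]
guided = cfg_combine(cond_pred, uncond_pred, guidance_scale=3.0)
```

A scale of 1 returns the conditional prediction unchanged, and larger values push further away from the unconditional branch.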
## Quantization
| Component | Source materialized MB | SDNQ state MB | Quantize s | Quant peak nvidia MB |
| --- | --- | --- | --- | --- |
| transformer | 17698.84 | 4979.66 | 112.64 | 36525.00 |
| unconditional_transformer | 17698.84 | 4979.66 | 108.68 | 36525.00 |
| text_encoder | 14435.59 | 4097.53 | 102.32 | 24477.00 |
| vae | 160.31 | 50.19 | 2.68 | 861.00 |
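The compression ratio per component follows directly from the two size columns. The snippet below recomputes it from the table values; the dictionary layout and names are only for illustration.

```python
# (source materialized MB, SDNQ state MB), copied from the table above.
QUANT_TABLE_MB = {
    "transformer": (17698.84, 4979.66),
    "unconditional_transformer": (17698.84, 4979.66),
    "text_encoder": (14435.59, 4097.53),
    "vae": (160.31, 50.19),
}


def compression_ratio(source_mb, sdnq_mb):
    """How many times smaller the SDNQ state is than the materialized source."""
    return source_mb / sdnq_mb


ratios = {
    name: round(compression_ratio(*sizes), 2)
    for name, sizes in QUANT_TABLE_MB.items()
}
total_source_mb = sum(source for source, _ in QUANT_TABLE_MB.values())
total_sdnq_mb = sum(sdnq for _, sdnq in QUANT_TABLE_MB.values())
overall_ratio = round(total_source_mb / total_sdnq_mb, 2)

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x")
print(f"overall: {overall_ratio}x")
```

The two transformers and the text encoder come out at roughly 3.5x, and the VAE at about 3.2x. A pure 16-bit to 4-bit repack would give 4x; the lower measured ratios plausibly reflect quantization metadata or layers left unquantized, though the source does not break this down.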
## Benchmark
Hardware: RunPod NVIDIA RTX PRO 6000 Blackwell Server Edition, single process, concurrency 1. Generation used 10 structured JSON prompts at 1024x1024 with `V4_DEFAULT_20`.
The FP8 baseline was loaded through the upstream `ideogram4` `Ideogram4Pipeline.from_pretrained` recipe with `weights_repo="ideogram-ai/ideogram-4-fp8"`; magic-prompt expansion was disabled because the prompts are already structured captions.
| Variant | Load s | Load peak reserved MB | Load peak nvidia MB | Cold request s | Hot mean s | Gen peak reserved MB | Gen peak nvidia MB |
| --- | --- | --- | --- | --- | --- | --- | --- |
| original | 267.83 | 28198.00 | 28759.00 | 17.90 | 17.51 | 34430.00 | 35099.00 |
| sdnq | 239.46 | 14558.00 | 15109.00 | 18.56 | 16.52 | 21650.00 | 22321.00 |
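The relative savings can be read off the table with a few lines of arithmetic. The snippet below uses three columns copied from the table; the key names are chosen for this example only.

```python
def percent_reduction(baseline, value):
    """Reduction of `value` relative to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


# Values copied from the benchmark table above.
BENCH = {
    "original": {
        "load_peak_reserved_mb": 28198.0,
        "gen_peak_reserved_mb": 34430.0,
        "hot_mean_s": 17.51,
    },
    "sdnq": {
        "load_peak_reserved_mb": 14558.0,
        "gen_peak_reserved_mb": 21650.0,
        "hot_mean_s": 16.52,
    },
}

savings = {
    key: round(percent_reduction(BENCH["original"][key], BENCH["sdnq"][key]), 1)
    for key in BENCH["original"]
}

for key, value in savings.items():
    print(f"{key}: {value}% lower with sdnq")
```

On this hardware the SDNQ variant reserves about 48% less memory after load and about 37% less at generation peak, with a hot mean roughly 6% lower. The cold request is slightly slower (18.56 s vs 17.90 s).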
## Example Matrix
The matrix (`assets/original_vs_sdnq_vertical.webp`) places the original FP8 and SDNQ UInt4 outputs side by side in narrow vertical columns. It is a WebP at quality 95.

## Prompt Set
| # | id | summary |
| --- | --- | --- |
| 1 | `editorial_watch_photo` | A photorealistic editorial product photograph of a transparent mechanical wristwatch resting on a wet black stone slab, with tiny engraved labels visible on the dial. |
| 2 | `risograph_botanical_poster` | A layered risograph botanical exhibition poster with bold overprint textures and clean typographic hierarchy. |
| 3 | `cyrillic_cafe_menu` | A cozy Moscow cafe menu board photographed straight-on, testing clean Cyrillic typography in chalk and printed labels. |
| 4 | `brutalist_architecture` | A cinematic architectural photograph of a brutalist library atrium with tiny wayfinding signs and people for scale. |
| 5 | `ink_manga_rain` | A dramatic black-and-white manga splash page of a courier cycling through rain, with sound effects and shop signage. |
| 6 | `museum_clay_render` | A polished 3D clay render of a museum diorama showing a future Arctic research station with labeled miniature modules. |
| 7 | `food_packaging_label` | A realistic premium chocolate bar packaging mockup with layered foil, embossed typography, and ingredient microcopy. |
| 8 | `fantasy_map_typography` | A hand-painted fantasy map on parchment with readable place names, compass ornament, and coastal illustrations. |
| 9 | `streetwear_lookbook` | A fashion lookbook cover photograph for a streetwear collection, with crisp cover typography and realistic fabric textures. |
| 10 | `scientific_cutaway` | A detailed scientific cutaway illustration of a compact fusion battery prototype with annotated parts and clean technical typography. |
## Files
- `prompts.json`: the 10 structured prompts used for the comparison.
- `assets/original_vs_sdnq_vertical.webp`: vertical side-by-side WebP comparison matrix for original FP8 vs SDNQ UInt4, quality 95.
- `assets/sdnq_vs_nf4_4090_vertical.webp`: vertical side-by-side WebP comparison matrix for the RTX 4090 SDNQ vs official NF4 follow-up, quality 95.
- `benchmark/`: raw benchmark JSONL/CSV files and `summary.json`.
- `quantization_manifest.json`: component-level quantization timings, storage, and VRAM peaks.
- `ideogram4_sdnq_pipeline.py`: loader helper for the SDNQ custom transformer components.
## RTX 4090 Follow-up: SDNQ UInt4 vs Official NF4
Hardware: RunPod NVIDIA GeForce RTX 4090, 24 GB VRAM, single process, concurrency 1. Both variants used the same 10 structured captions from `prompts.json`, 1024x1024, `V4_DEFAULT_20`, and no magic-prompt expansion. `nf4` uses the official `ideogram-ai/ideogram-4-nf4` checkpoint through the upstream `ideogram4` loader.
| Variant | Cases | Load s | Load peak reserved MB | Load peak nvidia MB | Cold request s | Hot mean s | Hot max s | Gen peak reserved MB | Gen peak nvidia MB |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| sdnq | 10 | 211.61 | 14124.00 | 14466.00 | 59.65 | 37.05 | 37.57 | 19768.00 | 20521.00 |
| nf4 | 10 | 269.31 | 15370.00 | 15766.00 | 36.57 | 36.31 | 36.77 | 21012.00 | 21801.00 |

Raw follow-up metrics are in `benchmark/summary_4090_sdnq_vs_nf4.json`, `benchmark/sdnq_4090_metrics.*`, and `benchmark/nf4_4090_metrics.*`. The exact runner used for the follow-up is `benchmark/followup_runner.py`.
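Two derived numbers make the RTX 4090 table easier to compare: the gap between the cold request and the hot mean, and the VRAM left at generation peak. The sketch below computes both from the table values. It assumes "24 GB" means 24576 MB and treats the cold-minus-hot gap as first-request overhead, which is an interpretation rather than something the benchmark files are known to report.

```python
VRAM_TOTAL_MB = 24 * 1024  # assumption: the card's 24 GB taken as 24576 MB

# Values copied from the RTX 4090 table above.
RUNS = {
    "sdnq": {"cold_s": 59.65, "hot_mean_s": 37.05, "gen_peak_nvidia_mb": 20521.0},
    "nf4": {"cold_s": 36.57, "hot_mean_s": 36.31, "gen_peak_nvidia_mb": 21801.0},
}


def summarize(run):
    """Derive first-request overhead and remaining VRAM for one variant."""
    return {
        "first_request_overhead_s": round(run["cold_s"] - run["hot_mean_s"], 2),
        "headroom_mb": VRAM_TOTAL_MB - run["gen_peak_nvidia_mb"],
    }


summary = {name: summarize(run) for name, run in RUNS.items()}

for name, stats in summary.items():
    print(name, stats)
```

Under these assumptions SDNQ pays about 22.6 s extra on the first request and leaves roughly 4 GB of headroom, while NF4 shows almost no first-request penalty and leaves under 3 GB.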
## License
This checkpoint is derived from `ideogram-ai/ideogram-4-fp8` and follows the upstream Ideogram 4 non-commercial license. See `LICENSE.md`.