---
license: other
license_name: ideogram-4-non-commercial
base_model: ideogram-ai/ideogram-4-fp8
pipeline_tag: text-to-image
tags:
- ideogram
- text-to-image
- sdnq
- uint4
- diffusion
- typography
---
# Ideogram 4 FP8 -> SDNQ UInt4

This is an experimental SDNQ UInt4 conversion of `ideogram-ai/ideogram-4-fp8`, intended for local research and non-commercial use under the upstream Ideogram 4 license. The conversion starts from the FP8 checkpoint: FP8 linear layers are materialized back to bf16, and static SDNQ `uint4` quantization is then applied component by component.

The model includes SDNQ-compressed `text_encoder`, `transformer`, `unconditional_transformer`, and `vae` components. The official `ideogram4` loader cannot instantiate SDNQ-packed custom transformers, so this repository ships its own loader helper, `ideogram4_sdnq_pipeline.py`.
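
For readers unfamiliar with 4-bit weight quantization, the sketch below shows a generic asymmetric uint4 scheme on a single group of weights: floats are mapped to integer codes 0..15 using one scale and one offset per group, and two codes are packed into each byte. It is an illustration only and should not be read as SDNQ's actual algorithm. The function names, the per-group layout, and the nibble order are all assumptions made for this example.

```python
def quantize_uint4(weights):
    """Map floats to integer codes 0..15 with one scale and offset per group."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 15 or 1.0  # fall back to 1.0 for a constant group
    codes = [min(15, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize_uint4(codes, scale, lo):
    """Reconstruct approximate floats from the 4-bit codes."""
    return [lo + c * scale for c in codes]


def pack_nibbles(codes):
    """Pack two 4-bit codes per byte, low nibble first (illustrative layout)."""
    if len(codes) % 2:
        codes = codes + [0]  # pad to an even count
    return bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))


def unpack_nibbles(packed, count):
    """Inverse of pack_nibbles; `count` drops any padding code."""
    codes = []
    for byte in packed:
        codes.extend((byte & 0x0F, byte >> 4))
    return codes[:count]


group = [-0.75, -0.25, 0.0, 0.1, 0.4, 0.75]  # made-up example weights
codes, scale, lo = quantize_uint4(group)
packed = pack_nibbles(codes)
restored = dequantize_uint4(unpack_nibbles(packed, len(group)), scale, lo)
max_error = max(abs(a - b) for a, b in zip(group, restored))
print(f"{len(group)} weights -> {len(packed)} bytes, max error {max_error:.4f}")
```

In this scheme the round-trip error per weight is bounded by half the group scale, which is why real quantizers keep groups small or add further corrections.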
## Usage

```python
import torch
from ideogram4 import PRESETS  # sampling presets from the upstream package
# ideogram4_sdnq_pipeline.py ships in this repository and must be importable.
from ideogram4_sdnq_pipeline import Ideogram4SDNQPipeline

pipe = Ideogram4SDNQPipeline.from_pretrained(
    "WaveCut/ideogram-4-sdnq-uint4",
    device="cuda",
    dtype=torch.bfloat16,
)

preset = PRESETS["V4_DEFAULT_20"]
image = pipe(
    "a typographic poster reading HELLO WORLD",
    height=1024,
    width=1024,
    num_steps=preset.num_steps,
    guidance_schedule=preset.guidance_schedule,
    mu=preset.mu,
    std=preset.std,
    seed=4101,
    raise_on_caption_issues=False,
)[0]
image.save("out.png")
```

Install requirements:

```bash
pip install git+https://github.com/ideogram-oss/ideogram4 sdnq safetensors transformers accelerate pillow
```
## Component Structure

Upstream FP8 structure:

- `text_encoder`: Qwen3-VL text path used in text-only mode. Hidden states from 13 layers are concatenated for the DiT.
- `transformer`: conditional 34-layer single-stream DiT.
- `unconditional_transformer`: image-only negative branch used for asymmetric CFG.
- `vae`: Flux2-style KL autoencoder decoder.
- `tokenizer` and `scheduler`: copied from upstream.
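
The interplay of the two transformers can be sketched with the standard classifier-free guidance (CFG) combination. This is a generic illustration and not the upstream sampler: `cond_pred` and `uncond_pred` are made-up stand-ins for the outputs of `transformer` and `unconditional_transformer` at one denoising step, and the list-of-scales schedule format is an assumption.

```python
def apply_cfg(cond_pred, uncond_pred, guidance_scale):
    """Standard CFG: start at the unconditional prediction and move toward
    (or past) the conditional one by `guidance_scale`."""
    return [u + guidance_scale * (c - u) for c, u in zip(cond_pred, uncond_pred)]


# Toy per-element predictions; real predictions are latent tensors.
cond_pred = [0.5, -0.2, 1.0]    # stand-in for the conditional branch output
uncond_pred = [0.3, 0.0, 1.0]   # stand-in for the unconditional branch output

# Hypothetical per-step guidance scales; the real schedule format may differ.
guidance_schedule = [1.0, 3.0]

guided = [apply_cfg(cond_pred, uncond_pred, g) for g in guidance_schedule]
for scale, pred in zip(guidance_schedule, guided):
    print(scale, [round(x, 3) for x in pred])
```

A scale of 1.0 reproduces the conditional prediction, and larger scales push further away from the unconditional one. Because the negative branch here is a separate network, both components have to be resident during generation.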
## Quantization

| Component | Source materialized (MB) | SDNQ state (MB) | Quantize time (s) | Quantization peak nvidia (MB) |
| --- | --- | --- | --- | --- |
| transformer | 17698.84 | 4979.66 | 112.64 | 36525.00 |
| unconditional_transformer | 17698.84 | 4979.66 | 108.68 | 36525.00 |
| text_encoder | 14435.59 | 4097.53 | 102.32 | 24477.00 |
| vae | 160.31 | 50.19 | 2.68 | 861.00 |
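
The size reduction per component follows directly from the table. The snippet below is a small convenience sketch that recomputes the ratios from the figures above; the dictionary layout and variable names are illustrative and do not mirror `quantization_manifest.json`.

```python
# (source materialized MB, SDNQ state MB), copied from the table above.
quantization = {
    "transformer": (17698.84, 4979.66),
    "unconditional_transformer": (17698.84, 4979.66),
    "text_encoder": (14435.59, 4097.53),
    "vae": (160.31, 50.19),
}

# Per-component compression ratio: source size divided by quantized size.
ratios = {name: src / sdnq for name, (src, sdnq) in quantization.items()}

total_src = sum(src for src, _ in quantization.values())
total_sdnq = sum(sdnq for _, sdnq in quantization.values())
total_ratio = total_src / total_sdnq

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x smaller")
print(f"total: {total_src:.2f} MB -> {total_sdnq:.2f} MB ({total_ratio:.2f}x)")
```

The ratios land somewhat below the 4x that a pure 16-bit to 4-bit conversion would give, presumably because quantization metadata and any tensors kept at higher precision are counted in the SDNQ state.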
## Benchmark

Hardware: RunPod NVIDIA RTX PRO 6000 Blackwell Server Edition, single process, concurrency 1. Generation used 10 structured JSON prompts at 1024x1024 with `V4_DEFAULT_20`.

The FP8 baseline was loaded through the upstream `ideogram4` `Ideogram4Pipeline.from_pretrained` recipe with `weights_repo="ideogram-ai/ideogram-4-fp8"`; magic-prompt expansion was disabled because the prompts are already structured captions.

| Variant | Load (s) | Load peak reserved (MB) | Load peak nvidia (MB) | Cold request (s) | Hot mean (s) | Gen peak reserved (MB) | Gen peak nvidia (MB) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| original | 267.83 | 28198.00 | 28759.00 | 17.90 | 17.51 | 34430.00 | 35099.00 |
| sdnq | 239.46 | 14558.00 | 15109.00 | 18.56 | 16.52 | 21650.00 | 22321.00 |
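
To read the table as relative changes, the sketch below recomputes the percentage difference of the SDNQ run against the original FP8 run for a few columns. The key names are illustrative and are not taken from the files in `benchmark/`.

```python
# Values copied from the benchmark table above.
original = {
    "load_s": 267.83,
    "cold_request_s": 17.90,
    "hot_mean_s": 17.51,
    "gen_peak_reserved_mb": 34430.0,
    "gen_peak_nvidia_mb": 35099.0,
}
sdnq = {
    "load_s": 239.46,
    "cold_request_s": 18.56,
    "hot_mean_s": 16.52,
    "gen_peak_reserved_mb": 21650.0,
    "gen_peak_nvidia_mb": 22321.0,
}


def percent_change(before, after):
    """Signed relative change in percent; negative means SDNQ is lower."""
    return (after - before) / before * 100.0


changes = {key: percent_change(original[key], sdnq[key]) for key in original}
for key, change in changes.items():
    print(f"{key}: {change:+.1f}%")
```

On these numbers, SDNQ lowers peak generation memory by more than a third and is slightly faster once warm, while its cold request is slightly slower.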
## Example Matrix

The matrix below keeps the original FP8 and SDNQ UInt4 outputs side by side in narrow vertical columns. It is a WebP at quality 95.


## Prompt Set

| # | id | summary |
| --- | --- | --- |
| 1 | `editorial_watch_photo` | A photorealistic editorial product photograph of a transparent mechanical wristwatch resting on a wet black stone slab, with tiny engraved labels visible on the dial. |
| 2 | `risograph_botanical_poster` | A layered risograph botanical exhibition poster with bold overprint textures and clean typographic hierarchy. |
| 3 | `cyrillic_cafe_menu` | A cozy Moscow cafe menu board photographed straight-on, testing clean Cyrillic typography in chalk and printed labels. |
| 4 | `brutalist_architecture` | A cinematic architectural photograph of a brutalist library atrium with tiny wayfinding signs and people for scale. |
| 5 | `ink_manga_rain` | A dramatic black-and-white manga splash page of a courier cycling through rain, with sound effects and shop signage. |
| 6 | `museum_clay_render` | A polished 3D clay render of a museum diorama showing a future Arctic research station with labeled miniature modules. |
| 7 | `food_packaging_label` | A realistic premium chocolate bar packaging mockup with layered foil, embossed typography, and ingredient microcopy. |
| 8 | `fantasy_map_typography` | A hand-painted fantasy map on parchment with readable place names, compass ornament, and coastal illustrations. |
| 9 | `streetwear_lookbook` | A fashion lookbook cover photograph for a streetwear collection, with crisp cover typography and realistic fabric textures. |
| 10 | `scientific_cutaway` | A detailed scientific cutaway illustration of a compact fusion battery prototype with annotated parts and clean technical typography. |
## Files

- `prompts.json`: the 10 structured prompts used for the comparison.
- `assets/original_vs_sdnq_vertical.webp`: vertical side-by-side WebP comparison matrix for original FP8 vs SDNQ UInt4, quality 95.
- `assets/sdnq_vs_nf4_4090_vertical.webp`: vertical side-by-side WebP comparison matrix for the RTX 4090 SDNQ vs official NF4 follow-up, quality 95.
- `benchmark/`: raw benchmark JSONL/CSV files and `summary.json`.
- `quantization_manifest.json`: component-level quantization timings, storage, and VRAM peaks.
- `ideogram4_sdnq_pipeline.py`: loader helper for the SDNQ custom transformer components.
## RTX 4090 Follow-up: SDNQ UInt4 vs Official NF4

Hardware: RunPod NVIDIA GeForce RTX 4090, 24 GB VRAM, single process, concurrency 1. Both variants used the same 10 structured captions from `prompts.json`, 1024x1024, `V4_DEFAULT_20`, and no magic-prompt expansion. `nf4` uses the official `ideogram-ai/ideogram-4-nf4` checkpoint through the upstream `ideogram4` loader.

| Variant | Cases | Load (s) | Load peak reserved (MB) | Load peak nvidia (MB) | Cold request (s) | Hot mean (s) | Hot max (s) | Gen peak reserved (MB) | Gen peak nvidia (MB) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| sdnq | 10 | 211.61 | 14124.00 | 14466.00 | 59.65 | 37.05 | 37.57 | 19768.00 | 20521.00 |
| nf4 | 10 | 269.31 | 15370.00 | 15766.00 | 36.57 | 36.31 | 36.77 | 21012.00 | 21801.00 |



Raw follow-up metrics are in `benchmark/summary_4090_sdnq_vs_nf4.json`, `benchmark/sdnq_4090_metrics.*`, and `benchmark/nf4_4090_metrics.*`. The exact runner used for the follow-up is `benchmark/followup_runner.py`.
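
Two quantities are easy to derive from the follow-up table: the first-request warm-up overhead (cold minus hot mean) and the memory headroom left on the card during generation. The sketch below computes both. It assumes the nominal 24 GB corresponds to 24 * 1024 MB and that the "nvidia" column is in the same unit, and its key names are illustrative rather than taken from `benchmark/summary_4090_sdnq_vs_nf4.json`.

```python
# Assumption: nominal 24 GB card, expressed as 24 * 1024 MB.
TOTAL_VRAM_MB = 24 * 1024

# Values copied from the RTX 4090 follow-up table above.
runs = {
    "sdnq": {"cold_s": 59.65, "hot_mean_s": 37.05, "gen_peak_nvidia_mb": 20521.0},
    "nf4": {"cold_s": 36.57, "hot_mean_s": 36.31, "gen_peak_nvidia_mb": 21801.0},
}

summary = {
    name: {
        # Extra time the first request takes compared with a warm request.
        "warmup_overhead_s": round(run["cold_s"] - run["hot_mean_s"], 2),
        # Memory left on the card at the generation peak.
        "headroom_mb": TOTAL_VRAM_MB - run["gen_peak_nvidia_mb"],
    }
    for name, run in runs.items()
}

for name, row in summary.items():
    print(name, row)
```

On these figures the SDNQ variant pays a noticeably larger one-time cost on its first request and leaves more free memory during generation, while warm request times are close for both variants.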
## License

This checkpoint is derived from `ideogram-ai/ideogram-4-fp8` and follows the upstream Ideogram 4 non-commercial license. See `LICENSE.md`.