nf4 build: visible gradient banding + vertical streaks in smooth skies

#5
by PierrunoYT - opened

Running ideogram-ai/ideogram-4-nf4 locally via Ideogram4Pipeline (diffusers, bf16, weights kept nf4-quantized, eager - no AOTI), I'm seeing color banding and faint vertical streaks in large smooth gradient regions (e.g. golden-hour skies). They're present in the raw saved PNG at native 1024x1024, so they are not an artifact of on-screen scaling.

  • Model: ideogram-ai/ideogram-4-nf4
  • diffusers: PR #13860 head (04b197e...), transformers 5.8.0, bitsandbytes (nf4)
  • GPU: NVIDIA (CUDA), Windows
  • Steps: Default 20 (also tried higher); guidance schedule 7.0 -> 3.0

Questions:

  1. Is this banding expected from the nf4 quantization, and does the fp8 build improve smooth-gradient fidelity?
  2. Any recommended settings (steps, guidance, dtype) to reduce banding?
  3. Is output dithering / higher-precision VAE decode advisable?

Repro prompt (JSON caption via local Qwen upsampler): "a movie poster for 'THE LAST SUMMER' with dramatic golden-hour backlighting".

[Attached image: ideogram4_1512970472]

Update: root cause likely matches the diffusers MRoPE inv_freq float32->bfloat16 downcast (see ideogram4 issue #24).

I verified on the installed diffusers Ideogram4Pipeline (nf4, torch_dtype=torch.bfloat16, CUDA/Windows):

  • In diffusers/models/transformers/transformer_ideogram4.py, inv_freq is built in float32 (persistent=False), but Ideogram4MRoPE(...).to(torch.bfloat16) downcasts it to bfloat16. from_pretrained(torch_dtype=torch.bfloat16) triggers exactly this.
  • forward() re-casts to float32, but only after precision was already lost, so it does not help.
  • IMAGE_POSITION_OFFSET = 65536 (vs 10000 in the ideogram4 inference repo). Phase error scales linearly with position, so the error is about 6.5x larger here.

Measured impact at position 65536 with bf16 inv_freq:

  • max phase error ~58.8 rad (>9 full rotations)
  • 12/64 frequencies have >pi rad error; 28/64 have >0.1 rad error
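
For anyone who wants to run this kind of check themselves, here is a minimal pure-Python sketch. It assumes the textbook RoPE form inv_freq[i] = theta^(-i/n) with theta = 10000 and n = 64, which may not match how Ideogram4MRoPE actually builds its frequencies, so its output need not reproduce the figures above. The helper names (to_f32, to_bf16, phase_errors) are illustrative and not part of diffusers.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_bf16(x: float) -> float:
    """Round to bfloat16 (round-to-nearest-even) and widen back to a float.

    bfloat16 is the upper 16 bits of a float32. The returned value is already
    widened, which mirrors a later re-cast to float32: the lost bits stay lost.
    NaN/inf/overflow handling is omitted; fine for small positive frequencies.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bias = 0x7FFF + ((bits >> 16) & 1)      # ties go to the even bf16 mantissa
    bits = (bits + bias) & 0xFFFF0000       # drop the low 16 bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def phase_errors(position: int, num_freqs: int = 64, theta: float = 10000.0):
    """Absolute phase error (radians) per frequency at the given position.

    ASSUMPTION: inv_freq[i] = theta ** (-i / num_freqs), the standard RoPE form.
    """
    errors = []
    for i in range(num_freqs):
        f32 = to_f32(theta ** (-i / num_freqs))
        bf16 = to_bf16(f32)
        errors.append(abs(position * (bf16 - f32)))
    return errors


errs = phase_errors(65536)
print("max phase error (rad):", max(errs))
print("frequencies with > pi rad error:", sum(e > math.pi for e in errs))
print("frequencies with > 0.1 rad error:", sum(e > 0.1 for e in errs))
```

bfloat16 keeps only 8 significant bits, so each frequency can be off by a relative 2^-8 at worst; multiplying by a position of 65536 turns that into many radians for the higher frequencies.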

So part of what I reported here is likely this positional-encoding corruption, fixable by restoring inv_freq in float32 after load. (The smooth-gradient banding/streaks may be a separate 4-bit quantization effect.)
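
A rough sketch of what "restoring inv_freq in float32 after load" could look like in PyTorch. ToyRoPE, build_inv_freq and restore_fp32_inv_freq are hypothetical stand-ins, not diffusers APIs, and the frequency formula is the textbook RoPE one rather than the one in transformer_ideogram4.py. For the real pipeline the buffer would have to be rebuilt with the module's own construction code, since casting the bf16 buffer back up cannot recover the lost bits.

```python
import torch
from torch import nn


def build_inv_freq(num_freqs: int, theta: float = 10000.0) -> torch.Tensor:
    # ASSUMPTION: standard RoPE frequencies; the real module may differ.
    exponents = torch.arange(num_freqs, dtype=torch.float32) / num_freqs
    return 1.0 / torch.pow(theta, exponents)


class ToyRoPE(nn.Module):
    """Hypothetical stand-in for the pipeline's rotary-embedding module."""

    def __init__(self, num_freqs: int = 64):
        super().__init__()
        self.num_freqs = num_freqs
        # Non-persistent buffer: not saved in the state dict, but still
        # converted by .to(dtype) like any other floating-point buffer.
        self.register_buffer("inv_freq", build_inv_freq(num_freqs), persistent=False)


def restore_fp32_inv_freq(model: nn.Module) -> int:
    """Rebuild inv_freq in float32 on every ToyRoPE; returns how many were patched."""
    patched = 0
    for module in model.modules():
        if isinstance(module, ToyRoPE):
            fresh = build_inv_freq(module.num_freqs).to(module.inv_freq.device)
            # Assigning a tensor to a registered buffer name replaces the buffer.
            module.inv_freq = fresh
            patched += 1
    return patched


rope = ToyRoPE().to(torch.bfloat16)  # mimics loading with torch_dtype=torch.bfloat16
print(rope.inv_freq.dtype)           # torch.bfloat16
restore_fp32_inv_freq(rope)
print(rope.inv_freq.dtype)           # torch.float32
```

The restore has to run after every dtype conversion of the module, because any later .to(torch.bfloat16) would downcast the buffer again.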

Ref: https://github.com/ideogram-oss/ideogram4/issues/24
