Discrepancy between number of transformer layers in config and paper

#33

by Sahiljain314 - opened Jul 11, 2023

Jul 11, 2023

I noticed that the config.json for the SDXL UNet contains the following: https://huggingface.co/stabilityai/stable-diffusion-xl-base-0.9/blob/main/unet/config.json#L59, which indicates there is 1 transformer block at the highest resolution level.

However, the SDXL paper makes a big point of mentioning that the actual transformer block counts are [0, 2, 10], and that they have omitted any transformer blocks at the highest level.

Am I missing something? If not, which one is correct?

furusu

Jul 12, 2023

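DownBlock2D contains no Transformer layers, so the first entry of transformer_layers_per_block is ignored for that block. The effective depths are therefore [0, 2, 10], which matches the paper.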
https://huggingface.co/stabilityai/stable-diffusion-xl-base-0.9/blob/025709258a55cc924dc47efd88959f18ae79830e/unet/config.json#L27
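
To make the pairing concrete, here is a minimal sketch in plain Python (it does not call diffusers). The helper `effective_transformer_depths` is hypothetical, and the list values are my reading of the linked config, so please check them against the file. It also assumes that only block types with "CrossAttn" in their name contain Transformer layers.

```python
from typing import List


def effective_transformer_depths(
    block_types: List[str], layers_per_block: List[int]
) -> List[int]:
    """Pair each down block with its configured Transformer depth.

    Hypothetical helper for illustration only. Assumption: a block uses
    its configured depth only if its type name contains "CrossAttn";
    otherwise the configured value is ignored and the depth is 0.
    """
    if len(block_types) != len(layers_per_block):
        raise ValueError("block_types and layers_per_block must have equal length")
    return [
        depth if "CrossAttn" in block_type else 0
        for block_type, depth in zip(block_types, layers_per_block)
    ]


# Values assumed from the linked unet/config.json; verify against the file.
down_block_types = ["DownBlock2D", "CrossAttnDownBlock2D", "CrossAttnDownBlock2D"]
transformer_layers_per_block = [1, 2, 10]

print(effective_transformer_depths(down_block_types, transformer_layers_per_block))
# prints [0, 2, 10]
```

Under those assumptions, the 1 in the config is simply never used, because the first block is a plain DownBlock2D.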
