---
library_name: diffusers
pipeline_tag: unconditional-image-generation
tags:
- diffusers
- deco
- image-generation
- class-conditional
- imagenet
license: mit
inference: true
widget:
- text: golden retriever
  output:
    url: DeCo-XL-16-512/demo.png
language:
- en
---

# DeCo-diffusers

Diffusers-ready checkpoints for **DeCo** (Decoupled Conditioning), converted for local/offline use.

This root folder is a model collection that contains:

- `DeCo-XL-16-256`
- `DeCo-XL-16-512`
- `DeCo-XXL-16-512-t2i` (text-to-image; requires `Qwen/Qwen3-1.7B` text encoder)

Each subfolder is a self-contained Diffusers model repo with:

- `pipeline.py`
- `transformer/transformer_deco.py`
- `scheduler/scheduling_deco_flow_match_euler_discrete.py`
- `transformer/diffusion_pytorch_model.safetensors`
- `vae/autoencoder_deco.py`

Each variant embeds English `id2label` directly in `model_index.json` (DiT-style), so class labels can be passed as ImageNet ids or English synonym strings.

- `pipe.id2label`: id → English label (comma-separated synonyms)
- `pipe.get_label_ids("golden retriever")`: English label → id

## Demo

![DeCo-XL-16-512 demo](DeCo-XL-16-512/demo.png)

Class-conditional sample (ImageNet class **207**, golden retriever), `DeCo-XL/16` at 512×512, 100 steps, CFG 5.0, seed 42.
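To make the label lookup described earlier more concrete, here is a minimal sketch of how a string could be matched against an `id2label` mapping whose values are comma-separated synonyms. The helper `find_label_ids` and the three-entry `ID2LABEL` excerpt are hypothetical and only for illustration; the bundled `pipe.get_label_ids` may match labels differently.

```python
# Toy excerpt of an ImageNet-style id2label mapping.
# Each value holds one or more synonyms separated by commas.
ID2LABEL = {
    1: "goldfish, Carassius auratus",
    207: "golden retriever",
    208: "Labrador retriever",
}


def find_label_ids(query, id2label):
    """Return the sorted ids whose synonym list contains `query`.

    Hypothetical helper for illustration; not the pipeline's own code.
    Matching is exact per synonym, ignoring case and surrounding spaces.
    """
    wanted = query.strip().lower()
    matches = []
    for class_id, names in id2label.items():
        synonyms = [name.strip().lower() for name in names.split(",")]
        if wanted in synonyms:
            matches.append(class_id)
    return sorted(matches)


print(find_label_ids("golden retriever", ID2LABEL))   # [207]
print(find_label_ids("Carassius auratus", ID2LABEL))  # [1]
print(find_label_ids("retriever", ID2LABEL))          # [] (no partial matches)
```

Returning a list rather than a single id leaves room for a synonym that appears under more than one class.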
## Model Paths

Use paths relative to this root README:

| Model | Resolution | Source checkpoint | Local path |
| --- | ---: | --- | --- |
| DeCo-XL/16 | 256×256 | `imagenet256_epoch800.ckpt` (EMA) | `./DeCo-XL-16-256` |
| DeCo-XL/16 | 512×512 | `imagenet512_epoch340.ckpt` (EMA) | `./DeCo-XL-16-512` |
| DeCo-XXL/16 | 512×512 t2i | `t2i_DeCo.ckpt` (EMA) | `./DeCo-XXL-16-512-t2i` |

## Inference Demo (Diffusers)

### 1) Load a local subfolder checkpoint

```python
import torch
from diffusers import DiffusionPipeline

model_path = "./DeCo-XL-16-512"  # change to ./DeCo-XL-16-256 for 256px
device = "cuda" if torch.cuda.is_available() else "cpu"

pipe = DiffusionPipeline.from_pretrained(
    model_path,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to(device)

generator = torch.Generator(device=device).manual_seed(42)

# ImageNet class example: 207 = golden retriever
print(pipe.id2label[207])
print(pipe.get_label_ids("golden retriever"))  # [207]

result = pipe(
    class_labels="golden retriever",
    num_inference_steps=100,
    guidance_scale=5.0,  # use 3.2 for DeCo-XL-16-256
    generator=generator,
)
image = result.images[0]
image.save("deco_xl_512_demo.png")
```

### 2) Quick variant switch (256 model)

```python
# Reuses the imports, `device`, and `generator` from step 1.
model_path = "./DeCo-XL-16-256"
pipe = DiffusionPipeline.from_pretrained(model_path, trust_remote_code=True).to(device)

image = pipe(
    class_labels=207,
    num_inference_steps=100,
    guidance_scale=3.2,
    generator=generator,
).images[0]
image.save("deco_xl_256_demo.png")
```

Integer class ids, batched labels, and an optional `batch_size` for repeating a single label are also supported.
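The accepted label forms can be pictured as one normalization step that turns any of them into a flat list with one entry per generated image. The sketch below is a hypothetical helper (`expand_labels` is not part of the bundled pipeline), and raising an error when `batch_size` is combined with several labels is an assumption, not documented behavior.

```python
def expand_labels(class_labels, batch_size=None):
    """Normalize a label argument into a list with one entry per image.

    Hypothetical helper for illustration. Accepts a single int id, a single
    label string, or a sequence of either. `batch_size` repeats a single label.
    """
    if isinstance(class_labels, (int, str)):
        labels = [class_labels]
    else:
        labels = list(class_labels)

    if batch_size is not None:
        # Assumption: repeating only makes sense for exactly one label.
        if len(labels) != 1:
            raise ValueError("batch_size can only repeat a single label")
        labels = labels * batch_size
    return labels


print(expand_labels(207))                              # [207]
print(expand_labels([207, "goldfish"]))                # [207, 'goldfish']
print(expand_labels("golden retriever", batch_size=3))
```

With the real pipeline, the corresponding calls would presumably look like `pipe(class_labels=[207, 1], ...)` or `pipe(class_labels=207, batch_size=4, ...)`; check `pipeline.py` in the subfolder for the exact signature.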
### 3) Text-to-image (`DeCo-XXL-16-512-t2i` / `t2i_DeCo.ckpt`)

Use the **AdamLM** scheduler defaults from official DeCo (not the c2i 100-step / CFG 5.0 recipe):

```python
import torch
from diffusers import DiffusionPipeline

model_path = "./DeCo-XXL-16-512-t2i"
device = "cuda" if torch.cuda.is_available() else "cpu"

pipe = DiffusionPipeline.from_pretrained(
    model_path,
    trust_remote_code=True,
    custom_pipeline=f"{model_path}/pipeline.py",
    torch_dtype=torch.bfloat16,
).to(device)

# Bundled ./text_encoder (Qwen3-1.7B weights + tokenizer). Pipeline loads both from that folder.
# Denoiser runs in float32 during __call__ (matches official GenEval predict).
image = pipe(
    prompt="a golden retriever playing in the snow, high quality photograph",
    negative_prompt="Unrealistic, JPEG artifacts.",
    num_inference_steps=25,
    guidance_scale=4.0,
    timeshift=3.0,
    generator=torch.Generator(device="cpu").manual_seed(42),
).images[0]
image.save("deco_t2i_demo.png")
```
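Because the three variants use different sampling settings, it can help to keep them in one place. The following is a small sketch that collects the values quoted in this README into a lookup keyed by subfolder name; `SAMPLING_DEFAULTS` and `sampling_kwargs` are hypothetical names introduced here, not part of the checkpoints.

```python
from pathlib import Path

# Sampling settings as quoted in this README, keyed by subfolder name.
SAMPLING_DEFAULTS = {
    "DeCo-XL-16-256": {"num_inference_steps": 100, "guidance_scale": 3.2},
    "DeCo-XL-16-512": {"num_inference_steps": 100, "guidance_scale": 5.0},
    "DeCo-XXL-16-512-t2i": {
        "num_inference_steps": 25,
        "guidance_scale": 4.0,
        "timeshift": 3.0,
    },
}


def sampling_kwargs(model_path):
    """Return a copy of the README's sampling settings for a local model path.

    Hypothetical helper: looks the variant up by the last path component.
    """
    name = Path(model_path).name
    if name not in SAMPLING_DEFAULTS:
        raise ValueError(f"unknown DeCo variant: {name!r}")
    return dict(SAMPLING_DEFAULTS[name])


print(sampling_kwargs("./DeCo-XL-16-256"))
print(sampling_kwargs("./DeCo-XXL-16-512-t2i")["timeshift"])  # 3.0
```

The result can be unpacked into a pipeline call, for example `pipe(class_labels=207, generator=generator, **sampling_kwargs(model_path))` for the class-conditional variants.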