---
library_name: diffusers
pipeline_tag: unconditional-image-generation
tags:
- diffusers
- deco
- image-generation
- class-conditional
- imagenet
license: mit
inference: true
widget:
- text: golden retriever
output:
url: DeCo-XL-16-512/demo.png
language:
- en
---
# DeCo-diffusers
Diffusers-ready checkpoints for **DeCo** (Decoupled Conditioning), converted for local/offline use.
This root folder is a model collection that contains:
- `DeCo-XL-16-256`
- `DeCo-XL-16-512`
- `DeCo-XXL-16-512-t2i` (text-to-image; requires `Qwen/Qwen3-1.7B` text encoder)
Each subfolder is a self-contained Diffusers model repo with:
- `pipeline.py`
- `transformer/transformer_deco.py`
- `scheduler/scheduling_deco_flow_match_euler_discrete.py`
- `transformer/diffusion_pytorch_model.safetensors`
- `vae/autoencoder_deco.py`
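Before loading a subfolder offline, it can help to confirm that the files listed above are actually present. The following is a minimal sketch and not part of this repo: `EXPECTED_FILES` simply mirrors the list above, and `missing_files` is a hypothetical helper name.

```python
from pathlib import Path

# Relative paths copied from the list above; a variant may ship additional files.
EXPECTED_FILES = (
    "pipeline.py",
    "transformer/transformer_deco.py",
    "scheduler/scheduling_deco_flow_match_euler_discrete.py",
    "transformer/diffusion_pytorch_model.safetensors",
    "vae/autoencoder_deco.py",
)


def missing_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected relative paths that are not regular files under model_dir."""
    root = Path(model_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files("./DeCo-XL-16-512")
    print("all files present" if not missing else f"missing: {missing}")
```

An empty result means every listed file was found; anything else names the paths to re-download.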
Each variant embeds an English `id2label` mapping directly in `model_index.json` (DiT-style), so class labels can be passed either as ImageNet ids or as English synonym strings:
- `pipe.id2label`: id → English label (comma-separated synonyms)
- `pipe.get_label_ids("golden retriever")`: English label → list of matching ids
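To illustrate the mapping format, here is a minimal sketch of how a synonym lookup over comma-separated labels could work. It is not the pipeline's actual implementation: `ID2LABEL` is a small hand-written excerpt in the format described above, and `build_synonym_index` and `lookup_label_ids` are hypothetical names.

```python
# Hypothetical excerpt of an id -> label mapping
# (comma-separated English synonyms per ImageNet class id).
ID2LABEL = {
    1: "goldfish, Carassius auratus",
    207: "golden retriever",
    208: "Labrador retriever",
}


def build_synonym_index(id2label):
    """Map each lowercased synonym to the sorted list of class ids that use it."""
    index = {}
    for class_id, names in id2label.items():
        for name in names.split(","):
            key = name.strip().lower()
            if key:
                index.setdefault(key, []).append(int(class_id))
    return {key: sorted(ids) for key, ids in index.items()}


def lookup_label_ids(label, index):
    """Resolve an English label to class ids; raise KeyError if it is unknown."""
    key = label.strip().lower()
    if key not in index:
        raise KeyError(f"unknown label: {label!r}")
    return index[key]


index = build_synonym_index(ID2LABEL)
print(lookup_label_ids("golden retriever", index))   # [207]
print(lookup_label_ids("Carassius auratus", index))  # [1]
```

Matching here is case-insensitive and exact per synonym; whether the real `get_label_ids` is equally strict is not stated in this README.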
## Demo

Class-conditional sample (ImageNet class **207**, golden retriever), `DeCo-XL/16` at 512×512, 100 steps, CFG 5.0, seed 42.
## Model Paths
Use paths relative to this root README:
| Model | Resolution | Source checkpoint | Local path |
| --- | ---: | --- | --- |
| DeCo-XL/16 | 256×256 | `imagenet256_epoch800.ckpt` (EMA) | `./DeCo-XL-16-256` |
| DeCo-XL/16 | 512×512 | `imagenet512_epoch340.ckpt` (EMA) | `./DeCo-XL-16-512` |
| DeCo-XXL/16 | 512×512 t2i | `t2i_DeCo.ckpt` (EMA) | `./DeCo-XXL-16-512-t2i` |
## Inference Demo (Diffusers)
### 1) Load a local subfolder checkpoint
```python
import torch
from diffusers import DiffusionPipeline

model_path = "./DeCo-XL-16-512"  # change to ./DeCo-XL-16-256 for 256px
device = "cuda" if torch.cuda.is_available() else "cpu"

pipe = DiffusionPipeline.from_pretrained(
    model_path,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to(device)
generator = torch.Generator(device=device).manual_seed(42)

# ImageNet class example: 207 = golden retriever
print(pipe.id2label[207])
print(pipe.get_label_ids("golden retriever"))  # [207]

result = pipe(
    class_labels="golden retriever",
    num_inference_steps=100,
    guidance_scale=5.0,  # use 3.2 for DeCo-XL-16-256
    generator=generator,
)
image = result.images[0]
image.save("deco_xl_512_demo.png")
```
### 2) Quick variant switch (256 model)
```python
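# Reuses `torch`, `DiffusionPipeline`, and `device` from step 1.
model_path = "./DeCo-XL-16-256"
pipe = DiffusionPipeline.from_pretrained(model_path, trust_remote_code=True).to(device)

# Re-seed: a generator already used for sampling has advanced past its initial state.
generator = torch.Generator(device=device).manual_seed(42)

image = pipe(
    class_labels=207,
    num_inference_steps=100,
    guidance_scale=3.2,
    generator=generator,
).images[0]
image.save("deco_xl_256_demo.png")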
```
Integer class ids, batched labels, and optional `batch_size` for repeating a single label are also supported.
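The accepted input forms can be pictured as being expanded into one label per generated image. The sketch below shows one plausible way to do that. It is an assumption about the semantics, not the pipeline's code: `expand_class_labels` is a hypothetical name, and rejecting a `batch_size` that disagrees with a multi-label list is a choice made here for illustration.

```python
def expand_class_labels(class_labels, batch_size=None):
    """Return a flat list with one label (int id or string) per image."""
    # A bare int or string counts as a single label.
    if isinstance(class_labels, (int, str)):
        labels = [class_labels]
    else:
        labels = list(class_labels)
    if not labels:
        raise ValueError("class_labels must not be empty")

    if batch_size is None:
        return labels
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    if len(labels) == 1:
        # Repeat the single label batch_size times.
        return labels * batch_size
    if len(labels) != batch_size:
        # Assumption: batch_size only repeats a single label.
        raise ValueError("batch_size conflicts with the number of labels")
    return labels


print(expand_class_labels(207))                    # [207]
print(expand_class_labels("golden retriever", 2))  # ['golden retriever', 'golden retriever']
print(expand_class_labels([207, 208, 1]))          # [207, 208, 1]
```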
### 3) Text-to-image (`DeCo-XXL-16-512-t2i` / `t2i_DeCo.ckpt`)
Use the **AdamLM** scheduler defaults from official DeCo rather than the class-conditional (c2i) recipe of 100 steps at CFG 5.0:
```python
import torch
from diffusers import DiffusionPipeline

model_path = "./DeCo-XXL-16-512-t2i"
device = "cuda" if torch.cuda.is_available() else "cpu"

pipe = DiffusionPipeline.from_pretrained(
    model_path,
    trust_remote_code=True,
    custom_pipeline=f"{model_path}/pipeline.py",
    torch_dtype=torch.bfloat16,
).to(device)

# Bundled ./text_encoder (Qwen3-1.7B weights + tokenizer). Pipeline loads both from that folder.
# Denoiser runs in float32 during __call__ (matches official GenEval predict).
image = pipe(
    prompt="a golden retriever playing in the snow, high quality photograph",
    negative_prompt="Unrealistic, JPEG artifacts.",
    num_inference_steps=25,
    guidance_scale=4.0,
    timeshift=3.0,
    generator=torch.Generator(device="cpu").manual_seed(42),
).images[0]
image.save("deco_t2i_demo.png")
```
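Because the sampling settings differ between variants, it may be convenient to keep them in one place. The sketch below collects only the values quoted in the examples above; `RECOMMENDED_SETTINGS` and `sampling_kwargs` are hypothetical names and not part of the pipeline.

```python
from pathlib import Path

# Settings quoted in the examples above, keyed by local subfolder name.
RECOMMENDED_SETTINGS = {
    "DeCo-XL-16-256": {"num_inference_steps": 100, "guidance_scale": 3.2},
    "DeCo-XL-16-512": {"num_inference_steps": 100, "guidance_scale": 5.0},
    "DeCo-XXL-16-512-t2i": {
        "num_inference_steps": 25,
        "guidance_scale": 4.0,
        "timeshift": 3.0,
    },
}


def sampling_kwargs(model_path, **overrides):
    """Return a copy of the settings for the subfolder, with overrides applied."""
    name = Path(model_path).name  # "./DeCo-XL-16-512" -> "DeCo-XL-16-512"
    if name not in RECOMMENDED_SETTINGS:
        raise KeyError(f"no recommended settings for {name!r}")
    return {**RECOMMENDED_SETTINGS[name], **overrides}


print(sampling_kwargs("./DeCo-XL-16-256"))
# {'num_inference_steps': 100, 'guidance_scale': 3.2}
```

With the class-conditional variants this would be used as `pipe(class_labels="golden retriever", generator=generator, **sampling_kwargs(model_path))`.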