---
license: cc-by-nc-sa-4.0
library_name: diffusers
pipeline_tag: unconditional-image-generation
tags:
  - diffusers
  - edm2
  - image-generation
  - class-conditional
  - imagenet
inference: true
widget:
  - output:
      url: edm2-img512-xxl-fid/demo.png
language:
  - en
---

# EDM2-diffusers

Diffusers-ready checkpoints for **EDM2** ([Analyzing and Improving the Training Dynamics of Diffusion Models](https://arxiv.org/abs/2312.02696)),
converted from [NVlabs/edm2](https://github.com/NVlabs/edm2) post-hoc reconstructions.

Official source weights: `https://nvlabs-fi-cdn.nvidia.com/edm2/posthoc-reconstructions/`

This root folder is a model collection that contains:

- `edm2-img512-xs-fid`
- `edm2-img512-s-fid`
- `edm2-img512-m-fid`
- `edm2-img512-l-fid`
- `edm2-img512-l-dino`
- `edm2-img512-xl-fid`
- `edm2-img512-xxl-fid`

Each subfolder is a self-contained Diffusers model repo with:

- `pipeline.py`
- `unet/unet_edm2.py`
- `scheduler/scheduler_config.json` (`EDMEulerScheduler`)
- `unet/diffusion_pytorch_model.safetensors`
- `vae/diffusion_pytorch_model.safetensors`
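
Before loading a subfolder it can help to confirm that these files are present. The snippet below is a minimal sketch rather than part of the repo: `EXPECTED_FILES` and `missing_files` are hypothetical names, and the list simply mirrors the bullets above.

```python
from pathlib import Path

# Files that the list above says every converted subfolder contains.
# (Hypothetical helper for illustration; not shipped with the checkpoints.)
EXPECTED_FILES = [
    "pipeline.py",
    "unet/unet_edm2.py",
    "scheduler/scheduler_config.json",
    "unet/diffusion_pytorch_model.safetensors",
    "vae/diffusion_pytorch_model.safetensors",
]


def missing_files(model_dir):
    """Return the expected files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# Prints an empty list when the subfolder is complete.
print(missing_files("./edm2-img512-xxl-fid"))
```

An empty list means the layout matches; anything else names the files to re-download or re-convert.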

## Demo

![edm2-img512-xxl-fid demo](edm2-img512-xxl-fid/demo.png)

Class-conditional sample (ImageNet class **207**, golden retriever), EDM2-XXL at 512×512, 32 steps, guidance 1.0, seed 42.

## Model Paths

Use paths relative to this root README:

| Model | NVlabs preset | FID | Local path |
| --- | --- | ---: | --- |
| EDM2-XS | `edm2-img512-xs-fid` | 3.53 | `./edm2-img512-xs-fid` |
| EDM2-S | `edm2-img512-s-fid` | 2.56 | `./edm2-img512-s-fid` |
| EDM2-M | `edm2-img512-m-fid` | 2.25 | `./edm2-img512-m-fid` |
| EDM2-L | `edm2-img512-l-fid` | 2.06 | `./edm2-img512-l-fid` |
| EDM2-L (DINO) | `edm2-img512-l-dino` | n/a | `./edm2-img512-l-dino` |
| EDM2-XL | `edm2-img512-xl-fid` | 1.96 | `./edm2-img512-xl-fid` |
| EDM2-XXL | `edm2-img512-xxl-fid` | 1.91 | `./edm2-img512-xxl-fid` |

## Inference Demo (Diffusers)

### 1) Load a local subfolder checkpoint

```python
from pathlib import Path
import torch
from diffusers import DiffusionPipeline

model_dir = Path("./edm2-img512-xxl-fid")  # change to any path in the table above
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    class_labels=207,          # golden retriever (ImageNet id); omit for random class
    num_inference_steps=32,
    guidance_scale=1.0,        # >1.0 requires a gnet/ checkpoint
    generator=generator,
).images[0]
image.save("demo.png")
```

Official inference defaults (`generate_images.py`): `num_steps=32`, `sigma_min=0.002`,
`sigma_max=80`, `rho=7`, `guidance=1.0` (no gnet), `S_churn=0`. Heun sampling runs in
float32 internally even when UNet/VAE weights are loaded in bf16/fp16.
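
To make the role of `num_steps`, `sigma_min`, `sigma_max`, and `rho` concrete, the sketch below computes the noise levels with the rho-parameterized schedule from the EDM papers. It assumes the converted scheduler follows that same formula; `karras_sigmas` is a hypothetical helper name, and the pipeline's own scheduler may differ in detail.

```python
def karras_sigmas(num_steps=32, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Noise levels from sigma_max down to sigma_min, followed by a final 0.

    Assumes the EDM schedule: interpolate linearly in sigma**(1/rho) space,
    then raise back to the power rho. Requires num_steps >= 2.
    """
    hi = sigma_max ** (1.0 / rho)
    lo = sigma_min ** (1.0 / rho)
    sigmas = [
        (hi + i / (num_steps - 1) * (lo - hi)) ** rho
        for i in range(num_steps)
    ]
    return sigmas + [0.0]  # the sampler ends at zero noise


sigmas = karras_sigmas()
print(len(sigmas))             # num_steps noise levels plus the trailing 0
print(sigmas[0], sigmas[-2])   # approximately sigma_max and sigma_min
```

With `rho=7` the levels are packed more densely near `sigma_min`, so most of the 32 steps are spent at low noise.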

Guided presets require a converted `gnet/` folder and a `guidance_scale` that matches the
NVlabs preset.
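
The reason a second network is needed can be sketched as follows. This assumes the pipeline combines the two denoiser outputs by linear extrapolation from the guiding network toward the main network, as the NVlabs sampler does; `guided_denoise` is a hypothetical name, and plain lists stand in for tensors.

```python
def guided_denoise(d_net, d_gnet, guidance_scale):
    """Combine main-network and guiding-network outputs element-wise.

    Assumed rule: result = d_gnet + guidance_scale * (d_net - d_gnet).
    At guidance_scale == 1.0 this reduces to d_net, so no gnet is needed.
    """
    if guidance_scale == 1.0:
        return list(d_net)
    return [g + guidance_scale * (n - g) for n, g in zip(d_net, d_gnet)]


# Toy values: the guided output moves past d_net, away from d_gnet.
print(guided_denoise([1.0, 2.0], [0.5, 1.0], 1.5))
```

Values above 1.0 push the sample further from what the guiding network predicts, which is why each preset pairs a specific `gnet` with a specific scale.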

### 2) Convert a legacy `.pkl`

```bash
python scripts/convert_edm2_to_diffusers.py \
  --checkpoint models/BiliSakura/EDM2-diffusers/edm2-img512-xs-2147483-0.135.pkl \
  --output models/BiliSakura/EDM2-diffusers
```

This creates `edm2-img512-xs-fid/` automatically, using the NVlabs preset mapping.
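
The source filenames follow a regular pattern, which the sketch below splits into fields. The field names are an interpretation rather than something the converter exposes: `kimg` is assumed to be the training length in thousands of images and `ema_std` the EMA profile width of the post-hoc reconstruction. `parse_pkl_name` is a hypothetical helper.

```python
import re

# Pattern inferred from the filenames listed in this README (assumption).
PKL_PATTERN = re.compile(
    r"^edm2-img(?P<resolution>\d+)-(?P<size>xs|s|m|l|xl|xxl)"
    r"(?P<uncond>-uncond)?-(?P<kimg>\d+)-(?P<ema_std>\d\.\d+)\.pkl$"
)


def parse_pkl_name(filename):
    """Split an EDM2 pickle filename into its components."""
    match = PKL_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized EDM2 checkpoint name: {filename!r}")
    return {
        "resolution": int(match["resolution"]),
        "size": match["size"],
        "conditional": match["uncond"] is None,
        "kimg": int(match["kimg"]),
        "ema_std": match["ema_std"],  # kept as text to preserve trailing zeros
    }


print(parse_pkl_name("edm2-img512-xs-2147483-0.135.pkl"))
```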

## Checkpoint preset mapping

Maps NVlabs `--preset=...` names from [`generate_images.py`](https://github.com/NVlabs/edm2/blob/main/generate_images.py)
to source pickle filenames and local Diffusers directories.

### EDM2 paper: ImageNet-512 (conditional)

| NVlabs preset | Source `.pkl` (net) | Diffusers dir | Metric |
| --- | --- | --- | --- |
| `edm2-img512-xs-fid` | `edm2-img512-xs-2147483-0.135.pkl` | `edm2-img512-xs-fid/` | FID 3.53 |
| `edm2-img512-xs-dino` | `edm2-img512-xs-2147483-0.200.pkl` | n/a | FD<sub>DINOv2</sub> 103.39 |
| `edm2-img512-s-fid` | `edm2-img512-s-2147483-0.130.pkl` | `edm2-img512-s-fid/` | FID 2.56 |
| `edm2-img512-s-dino` | `edm2-img512-s-2147483-0.190.pkl` | n/a | FD<sub>DINOv2</sub> 68.64 |
| `edm2-img512-m-fid` | `edm2-img512-m-2147483-0.100.pkl` | `edm2-img512-m-fid/` | FID 2.25 |
| `edm2-img512-m-dino` | `edm2-img512-m-2147483-0.155.pkl` | n/a | FD<sub>DINOv2</sub> 58.44 |
| `edm2-img512-l-fid` | `edm2-img512-l-1879048-0.085.pkl` | `edm2-img512-l-fid/` | FID 2.06 |
| `edm2-img512-l-dino` | `edm2-img512-l-1879048-0.155.pkl` | `edm2-img512-l-dino/` | FD<sub>DINOv2</sub> 52.25 |
| `edm2-img512-xl-fid` | `edm2-img512-xl-1342177-0.085.pkl` | `edm2-img512-xl-fid/` | FID 1.96 |
| `edm2-img512-xl-dino` | `edm2-img512-xl-1342177-0.155.pkl` | n/a | FD<sub>DINOv2</sub> 45.96 |
| `edm2-img512-xxl-fid` | `edm2-img512-xxl-0939524-0.070.pkl` | `edm2-img512-xxl-fid/` | FID 1.91 |
| `edm2-img512-xxl-dino` | `edm2-img512-xxl-0939524-0.150.pkl` | n/a | FD<sub>DINOv2</sub> 42.84 |

### EDM2 paper: ImageNet-64 (conditional)

| NVlabs preset | Source `.pkl` (net) | Metric |
| --- | --- | --- |
| `edm2-img64-s-fid` | `edm2-img64-s-1073741-0.075.pkl` | FID 1.58 |
| `edm2-img64-m-fid` | `edm2-img64-m-2147483-0.060.pkl` | FID 1.43 |
| `edm2-img64-l-fid` | `edm2-img64-l-1073741-0.040.pkl` | FID 1.33 |
| `edm2-img64-xl-fid` | `edm2-img64-xl-0671088-0.040.pkl` | FID 1.33 |

### EDM2 paper: classifier-free guidance (ImageNet-512)

Set `guidance_scale` to the value in the Guidance column and include the converted `gnet/` checkpoint.

| NVlabs preset | Source `.pkl` (net) | Source `.pkl` (gnet) | Guidance | Metric |
| --- | --- | --- | ---: | --- |
| `edm2-img512-xs-guid-fid` | `edm2-img512-xs-2147483-0.045.pkl` | `edm2-img512-xs-uncond-2147483-0.045.pkl` | 1.40 | FID 2.91 |
| `edm2-img512-xs-guid-dino` | `edm2-img512-xs-2147483-0.150.pkl` | `edm2-img512-xs-uncond-2147483-0.150.pkl` | 1.70 | FD<sub>DINOv2</sub> 79.94 |
| `edm2-img512-s-guid-fid` | `edm2-img512-s-2147483-0.025.pkl` | `edm2-img512-xs-uncond-2147483-0.025.pkl` | 1.40 | FID 2.23 |
| `edm2-img512-s-guid-dino` | `edm2-img512-s-2147483-0.085.pkl` | `edm2-img512-xs-uncond-2147483-0.085.pkl` | 1.90 | FD<sub>DINOv2</sub> 52.32 |
| `edm2-img512-m-guid-fid` | `edm2-img512-m-2147483-0.030.pkl` | `edm2-img512-xs-uncond-2147483-0.030.pkl` | 1.20 | FID 2.01 |
| `edm2-img512-m-guid-dino` | `edm2-img512-m-2147483-0.015.pkl` | `edm2-img512-xs-uncond-2147483-0.015.pkl` | 2.00 | FD<sub>DINOv2</sub> 41.98 |
| `edm2-img512-l-guid-fid` | `edm2-img512-l-1879048-0.015.pkl` | `edm2-img512-xs-uncond-2147483-0.015.pkl` | 1.20 | FID 1.88 |
| `edm2-img512-l-guid-dino` | `edm2-img512-l-1879048-0.035.pkl` | `edm2-img512-xs-uncond-2147483-0.035.pkl` | 1.70 | FD<sub>DINOv2</sub> 38.20 |
| `edm2-img512-xl-guid-fid` | `edm2-img512-xl-1342177-0.020.pkl` | `edm2-img512-xs-uncond-2147483-0.020.pkl` | 1.20 | FID 1.85 |
| `edm2-img512-xl-guid-dino` | `edm2-img512-xl-1342177-0.030.pkl` | `edm2-img512-xs-uncond-2147483-0.030.pkl` | 1.70 | FD<sub>DINOv2</sub> 35.67 |
| `edm2-img512-xxl-guid-fid` | `edm2-img512-xxl-0939524-0.015.pkl` | `edm2-img512-xs-uncond-2147483-0.015.pkl` | 1.20 | FID 1.81 |
| `edm2-img512-xxl-guid-dino` | `edm2-img512-xxl-0939524-0.015.pkl` | `edm2-img512-xs-uncond-2147483-0.015.pkl` | 1.70 | FD<sub>DINOv2</sub> 33.09 |

### Autoguidance paper

| NVlabs preset | Source `.pkl` (net) | Source `.pkl` (gnet) | Guidance | Metric |
| --- | --- | --- | ---: | --- |
| `edm2-img512-s-autog-fid` | `edm2-img512-s-2147483-0.070.pkl` | `edm2-img512-xs-0134217-0.125.pkl` | 2.10 | FID 1.34 |
| `edm2-img512-s-autog-dino` | `edm2-img512-s-2147483-0.120.pkl` | `edm2-img512-xs-0134217-0.165.pkl` | 2.45 | FD<sub>DINOv2</sub> 36.67 |
| `edm2-img512-xxl-autog-fid` | `edm2-img512-xxl-0939524-0.075.pkl` | `edm2-img512-m-0268435-0.155.pkl` | 2.05 | FID 1.25 |
| `edm2-img512-xxl-autog-dino` | `edm2-img512-xxl-0939524-0.130.pkl` | `edm2-img512-m-0268435-0.205.pkl` | 2.30 | FD<sub>DINOv2</sub> 24.18 |
| `edm2-img512-s-uncond-autog-fid` | `edm2-img512-s-uncond-2147483-0.070.pkl` | `edm2-img512-xs-uncond-0134217-0.110.pkl` | 2.85 | FID 3.86 |
| `edm2-img512-s-uncond-autog-dino` | `edm2-img512-s-uncond-2147483-0.090.pkl` | `edm2-img512-xs-uncond-0134217-0.125.pkl` | 2.90 | FD<sub>DINOv2</sub> 90.39 |
| `edm2-img64-s-autog-fid` | `edm2-img64-s-1073741-0.045.pkl` | `edm2-img64-xs-0134217-0.110.pkl` | 1.70 | FID 1.01 |
| `edm2-img64-s-autog-dino` | `edm2-img64-s-1073741-0.105.pkl` | `edm2-img64-xs-0134217-0.175.pkl` | 2.20 | FD<sub>DINOv2</sub> 31.85 |

### NVlabs preset shorthand

```text
# EDM2 paper
edm2-img512-{xs|s|m|l|xl|xxl}-{fid|dino}
edm2-img64-{s|m|l|xl}-fid
edm2-img512-{xs|s|m|l|xl|xxl}-guid-{fid|dino}

# Autoguidance paper
edm2-img512-{s|xxl}-autog-{fid|dino}
edm2-img512-s-uncond-autog-{fid|dino}
edm2-img64-s-autog-{fid|dino}
```
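
The shorthand above can be expanded mechanically, which is handy for checking a preset name before passing it on. This is an illustrative sketch; `all_presets` and the constant names are hypothetical and not part of the NVlabs code.

```python
from itertools import product

SIZES_512 = ["xs", "s", "m", "l", "xl", "xxl"]
SIZES_64 = ["s", "m", "l", "xl"]
METRICS = ["fid", "dino"]


def all_presets():
    """Expand the brace shorthand into the full list of preset names."""
    names = []
    # EDM2 paper
    names += [f"edm2-img512-{s}-{m}" for s, m in product(SIZES_512, METRICS)]
    names += [f"edm2-img64-{s}-fid" for s in SIZES_64]
    names += [f"edm2-img512-{s}-guid-{m}" for s, m in product(SIZES_512, METRICS)]
    # Autoguidance paper
    names += [f"edm2-img512-{s}-autog-{m}" for s, m in product(["s", "xxl"], METRICS)]
    names += [f"edm2-img512-s-uncond-autog-{m}" for m in METRICS]
    names += [f"edm2-img64-s-autog-{m}" for m in METRICS]
    return names


presets = all_presets()
print(len(presets))  # 36, one per row of the tables above
print("edm2-img512-s-guid-dino" in presets)  # True
```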

Example NVlabs command:

```bash
python generate_images.py --preset=edm2-img512-s-guid-dino --outdir=out
```

Equivalent expanded form:

```bash
python generate_images.py \
  --net=https://nvlabs-fi-cdn.nvidia.com/edm2/posthoc-reconstructions/edm2-img512-s-2147483-0.085.pkl \
  --gnet=https://nvlabs-fi-cdn.nvidia.com/edm2/posthoc-reconstructions/edm2-img512-xs-uncond-2147483-0.085.pkl \
  --guidance=1.9 \
  --outdir=out
```
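
The expansion from a table row to the long-form command can also be sketched in Python. The flags and base URL are the ones shown above; `expanded_command` and `MODEL_ROOT` are hypothetical names introduced here for illustration.

```python
import shlex

# Base URL of the official post-hoc reconstructions, as given in this README.
MODEL_ROOT = "https://nvlabs-fi-cdn.nvidia.com/edm2/posthoc-reconstructions"


def expanded_command(net, gnet=None, guidance=1.0, outdir="out"):
    """Build the long-form generate_images.py command for one table row."""
    args = ["python", "generate_images.py", f"--net={MODEL_ROOT}/{net}"]
    if gnet is not None:
        # Guidance flags only make sense when a guiding network is given.
        args.append(f"--gnet={MODEL_ROOT}/{gnet}")
        args.append(f"--guidance={guidance}")
    args.append(f"--outdir={outdir}")
    return " ".join(shlex.quote(a) for a in args)


print(expanded_command(
    net="edm2-img512-s-2147483-0.085.pkl",
    gnet="edm2-img512-xs-uncond-2147483-0.085.pkl",
    guidance=1.9,
))
```

Called with the `edm2-img512-s-guid-dino` row, this yields the same arguments as the expanded command above, on a single line.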