---
license: cc-by-nc-sa-4.0
library_name: diffusers
pipeline_tag: unconditional-image-generation
tags:
- diffusers
- edm2
- image-generation
- class-conditional
- imagenet
inference: true
widget:
- output:
    url: edm2-img512-xxl-fid/demo.png
language:
- en
---
# EDM2-diffusers
Diffusers-ready checkpoints for **EDM2** ([Analyzing and Improving the Training Dynamics of Diffusion Models](https://arxiv.org/abs/2312.02696)),
converted from [NVlabs/edm2](https://github.com/NVlabs/edm2) post-hoc reconstructions.

Official source weights: `https://nvlabs-fi-cdn.nvidia.com/edm2/posthoc-reconstructions/`

This root folder is a model collection that contains:
- `edm2-img512-xs-fid`
- `edm2-img512-s-fid`
- `edm2-img512-m-fid`
- `edm2-img512-l-fid`
- `edm2-img512-l-dino`
- `edm2-img512-xl-fid`
- `edm2-img512-xxl-fid`
Each subfolder is a self-contained Diffusers model repo with:
- `pipeline.py`
- `unet/unet_edm2.py`
- `scheduler/scheduler_config.json` (`EDMEulerScheduler`)
- `unet/diffusion_pytorch_model.safetensors`
- `vae/diffusion_pytorch_model.safetensors`
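
Before loading a subfolder, it can help to confirm that it is complete. The following is a minimal sketch that checks only for the files listed above; the helper name `missing_files` is illustrative and not part of this repo.

```python
from pathlib import Path

# Files listed above as present in each converted subfolder.
EXPECTED_FILES = (
    "pipeline.py",
    "unet/unet_edm2.py",
    "scheduler/scheduler_config.json",
    "unet/diffusion_pytorch_model.safetensors",
    "vae/diffusion_pytorch_model.safetensors",
)


def missing_files(model_dir):
    """Return the expected files that are absent from model_dir (hypothetical helper)."""
    root = Path(model_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


# An empty list means every file named above was found.
print(missing_files("./edm2-img512-xxl-fid"))
```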
## Demo
![edm2-img512-xxl-fid demo](edm2-img512-xxl-fid/demo.png)
Class-conditional sample (ImageNet class **207**, golden retriever), EDM2-XXL at 512×512, 32 steps, guidance 1.0, seed 42.
## Model Paths
Use paths relative to this root README:
| Model | NVlabs preset | FID | Local path |
| --- | --- | ---: | --- |
| EDM2-XS | `edm2-img512-xs-fid` | 3.53 | `./edm2-img512-xs-fid` |
| EDM2-S | `edm2-img512-s-fid` | 2.56 | `./edm2-img512-s-fid` |
| EDM2-M | `edm2-img512-m-fid` | 2.25 | `./edm2-img512-m-fid` |
| EDM2-L | `edm2-img512-l-fid` | 2.06 | `./edm2-img512-l-fid` |
| EDM2-L (DINO) | `edm2-img512-l-dino` | — | `./edm2-img512-l-dino` |
| EDM2-XL | `edm2-img512-xl-fid` | 1.96 | `./edm2-img512-xl-fid` |
| EDM2-XXL | `edm2-img512-xxl-fid` | 1.91 | `./edm2-img512-xxl-fid` |
## Inference Demo (Diffusers)
### 1) Load a local subfolder checkpoint
```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

model_dir = Path("./edm2-img512-xxl-fid")  # change to any path in the table above
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    class_labels=207,  # golden retriever (ImageNet id); omit for random class
    num_inference_steps=32,
    guidance_scale=1.0,  # >1.0 requires a gnet/ checkpoint
    generator=generator,
).images[0]
image.save("demo.png")
```
Official inference defaults (`generate_images.py`): `num_steps=32`, `sigma_min=0.002`,
`sigma_max=80`, `rho=7`, `guidance=1.0` (no gnet), `S_churn=0`. Heun sampling runs in
float32 internally even when UNet/VAE weights are loaded in bf16/fp16.
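
The `sigma_min`, `sigma_max`, and `rho` defaults parameterize the noise schedule from the EDM paper (Karras et al., 2022), which spaces noise levels uniformly in `sigma ** (1 / rho)`. A minimal sketch in plain Python follows; the function name `karras_sigmas` is illustrative, and it is an assumption that the scheduler in the converted pipeline produces the same values.

```python
def karras_sigmas(num_steps=32, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Noise levels from sigma_max down to sigma_min, uniform in sigma ** (1 / rho)."""
    hi = sigma_max ** (1.0 / rho)
    lo = sigma_min ** (1.0 / rho)
    return [(hi + i / (num_steps - 1) * (lo - hi)) ** rho for i in range(num_steps)]


sigmas = karras_sigmas()
# First and last entries are (up to rounding) sigma_max and sigma_min.
print(len(sigmas), sigmas[0], sigmas[-1])
```

The reference sampler appends a final noise level of 0 after these values, so the last step lands on a clean image.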
Guided presets require a converted `gnet/` folder and `guidance_scale` matching the
NVlabs preset.
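
In the NVlabs reference sampler, guidance extrapolates from the guiding network's denoised output towards the main network's output. The sketch below shows that blend on plain lists of floats; `guided_denoise` is an illustrative name, and it is an assumption that the converted pipeline applies the same rule.

```python
def guided_denoise(d_net, d_gnet, guidance_scale):
    """Blend two denoised outputs: gnet + guidance_scale * (net - gnet)."""
    return [g + guidance_scale * (n - g) for n, g in zip(d_net, d_gnet)]


d_net = [1.0, 2.0]   # toy values standing in for the main network output
d_gnet = [0.5, 1.0]  # toy values standing in for the guiding network output

print(guided_denoise(d_net, d_gnet, 1.0))  # scale 1.0 reproduces d_net
print(guided_denoise(d_net, d_gnet, 2.0))  # scale > 1.0 pushes past d_net, away from d_gnet
```

With `guidance_scale=1.0` the guiding output cancels out, which is why unguided presets need no `gnet/` folder.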
### 2) Convert a legacy `.pkl`
```bash
python scripts/convert_edm2_to_diffusers.py \
  --checkpoint models/BiliSakura/EDM2-diffusers/edm2-img512-xs-2147483-0.135.pkl \
  --output models/BiliSakura/EDM2-diffusers
```
The script creates `edm2-img512-xs-fid/` automatically, taking the folder name from the NVlabs preset mapping.
## Checkpoint preset mapping
Maps NVlabs `--preset=...` names from [`generate_images.py`](https://github.com/NVlabs/edm2/blob/main/generate_images.py)
to source pickle filenames and local Diffusers directories.
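
The source pickle filenames in this section share one pattern, so they can be split into fields programmatically. The sketch below is a hypothetical helper, not part of this repo. It assumes that the integer field is the training progress in thousands of images and that the decimal field is the post-hoc EMA length; that reading follows the NVlabs snapshot naming and is not stated in this README.

```python
import re
from typing import NamedTuple, Optional


class Edm2Checkpoint(NamedTuple):
    resolution: int       # 512 or 64
    size: str             # xs, s, m, l, xl, xxl
    unconditional: bool   # True when the name contains "-uncond"
    kimg: int             # assumed: training progress in thousands of images
    ema: float            # assumed: post-hoc EMA length


_PATTERN = re.compile(
    r"^edm2-img(?P<res>\d+)-(?P<size>xs|s|m|l|xl|xxl)(?P<uncond>-uncond)?"
    r"-(?P<kimg>\d+)-(?P<ema>\d+\.\d+)\.pkl$"
)


def parse_checkpoint_name(filename: str) -> Optional[Edm2Checkpoint]:
    """Split an EDM2 pickle filename into fields, or return None if it does not match."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    return Edm2Checkpoint(
        resolution=int(match["res"]),
        size=match["size"],
        unconditional=match["uncond"] is not None,
        kimg=int(match["kimg"]),
        ema=float(match["ema"]),
    )


print(parse_checkpoint_name("edm2-img512-xs-2147483-0.135.pkl"))
```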
### EDM2 paper — ImageNet-512 (conditional)
| NVlabs preset | Source `.pkl` (net) | Diffusers dir | Metric |
| --- | --- | --- | --- |
| `edm2-img512-xs-fid` | `edm2-img512-xs-2147483-0.135.pkl` | `edm2-img512-xs-fid/` | FID 3.53 |
| `edm2-img512-xs-dino` | `edm2-img512-xs-2147483-0.200.pkl` | — | FD<sub>DINOv2</sub> 103.39 |
| `edm2-img512-s-fid` | `edm2-img512-s-2147483-0.130.pkl` | `edm2-img512-s-fid/` | FID 2.56 |
| `edm2-img512-s-dino` | `edm2-img512-s-2147483-0.190.pkl` | — | FD<sub>DINOv2</sub> 68.64 |
| `edm2-img512-m-fid` | `edm2-img512-m-2147483-0.100.pkl` | `edm2-img512-m-fid/` | FID 2.25 |
| `edm2-img512-m-dino` | `edm2-img512-m-2147483-0.155.pkl` | — | FD<sub>DINOv2</sub> 58.44 |
| `edm2-img512-l-fid` | `edm2-img512-l-1879048-0.085.pkl` | `edm2-img512-l-fid/` | FID 2.06 |
| `edm2-img512-l-dino` | `edm2-img512-l-1879048-0.155.pkl` | `edm2-img512-l-dino/` | FD<sub>DINOv2</sub> 52.25 |
| `edm2-img512-xl-fid` | `edm2-img512-xl-1342177-0.085.pkl` | `edm2-img512-xl-fid/` | FID 1.96 |
| `edm2-img512-xl-dino` | `edm2-img512-xl-1342177-0.155.pkl` | — | FD<sub>DINOv2</sub> 45.96 |
| `edm2-img512-xxl-fid` | `edm2-img512-xxl-0939524-0.070.pkl` | `edm2-img512-xxl-fid/` | FID 1.91 |
| `edm2-img512-xxl-dino` | `edm2-img512-xxl-0939524-0.150.pkl` | — | FD<sub>DINOv2</sub> 42.84 |
### EDM2 paper — ImageNet-64 (conditional)
| NVlabs preset | Source `.pkl` (net) | Metric |
| --- | --- | --- |
| `edm2-img64-s-fid` | `edm2-img64-s-1073741-0.075.pkl` | FID 1.58 |
| `edm2-img64-m-fid` | `edm2-img64-m-2147483-0.060.pkl` | FID 1.43 |
| `edm2-img64-l-fid` | `edm2-img64-l-1073741-0.040.pkl` | FID 1.33 |
| `edm2-img64-xl-fid` | `edm2-img64-xl-0671088-0.040.pkl` | FID 1.33 |
### EDM2 paper — classifier-free guidance (ImageNet-512)
Set `guidance_scale` to the value in the Guidance column and include the converted `gnet/` checkpoint.
| NVlabs preset | Source `.pkl` (net) | Source `.pkl` (gnet) | Guidance | Metric |
| --- | --- | --- | ---: | --- |
| `edm2-img512-xs-guid-fid` | `edm2-img512-xs-2147483-0.045.pkl` | `edm2-img512-xs-uncond-2147483-0.045.pkl` | 1.40 | FID 2.91 |
| `edm2-img512-xs-guid-dino` | `edm2-img512-xs-2147483-0.150.pkl` | `edm2-img512-xs-uncond-2147483-0.150.pkl` | 1.70 | FD<sub>DINOv2</sub> 79.94 |
| `edm2-img512-s-guid-fid` | `edm2-img512-s-2147483-0.025.pkl` | `edm2-img512-xs-uncond-2147483-0.025.pkl` | 1.40 | FID 2.23 |
| `edm2-img512-s-guid-dino` | `edm2-img512-s-2147483-0.085.pkl` | `edm2-img512-xs-uncond-2147483-0.085.pkl` | 1.90 | FD<sub>DINOv2</sub> 52.32 |
| `edm2-img512-m-guid-fid` | `edm2-img512-m-2147483-0.030.pkl` | `edm2-img512-xs-uncond-2147483-0.030.pkl` | 1.20 | FID 2.01 |
| `edm2-img512-m-guid-dino` | `edm2-img512-m-2147483-0.015.pkl` | `edm2-img512-xs-uncond-2147483-0.015.pkl` | 2.00 | FD<sub>DINOv2</sub> 41.98 |
| `edm2-img512-l-guid-fid` | `edm2-img512-l-1879048-0.015.pkl` | `edm2-img512-xs-uncond-2147483-0.015.pkl` | 1.20 | FID 1.88 |
| `edm2-img512-l-guid-dino` | `edm2-img512-l-1879048-0.035.pkl` | `edm2-img512-xs-uncond-2147483-0.035.pkl` | 1.70 | FD<sub>DINOv2</sub> 38.20 |
| `edm2-img512-xl-guid-fid` | `edm2-img512-xl-1342177-0.020.pkl` | `edm2-img512-xs-uncond-2147483-0.020.pkl` | 1.20 | FID 1.85 |
| `edm2-img512-xl-guid-dino` | `edm2-img512-xl-1342177-0.030.pkl` | `edm2-img512-xs-uncond-2147483-0.030.pkl` | 1.70 | FD<sub>DINOv2</sub> 35.67 |
| `edm2-img512-xxl-guid-fid` | `edm2-img512-xxl-0939524-0.015.pkl` | `edm2-img512-xs-uncond-2147483-0.015.pkl` | 1.20 | FID 1.81 |
| `edm2-img512-xxl-guid-dino` | `edm2-img512-xxl-0939524-0.015.pkl` | `edm2-img512-xs-uncond-2147483-0.015.pkl` | 1.70 | FD<sub>DINOv2</sub> 33.09 |
### Autoguidance paper
| NVlabs preset | Source `.pkl` (net) | Source `.pkl` (gnet) | Guidance | Metric |
| --- | --- | --- | ---: | --- |
| `edm2-img512-s-autog-fid` | `edm2-img512-s-2147483-0.070.pkl` | `edm2-img512-xs-0134217-0.125.pkl` | 2.10 | FID 1.34 |
| `edm2-img512-s-autog-dino` | `edm2-img512-s-2147483-0.120.pkl` | `edm2-img512-xs-0134217-0.165.pkl` | 2.45 | FD<sub>DINOv2</sub> 36.67 |
| `edm2-img512-xxl-autog-fid` | `edm2-img512-xxl-0939524-0.075.pkl` | `edm2-img512-m-0268435-0.155.pkl` | 2.05 | FID 1.25 |
| `edm2-img512-xxl-autog-dino` | `edm2-img512-xxl-0939524-0.130.pkl` | `edm2-img512-m-0268435-0.205.pkl` | 2.30 | FD<sub>DINOv2</sub> 24.18 |
| `edm2-img512-s-uncond-autog-fid` | `edm2-img512-s-uncond-2147483-0.070.pkl` | `edm2-img512-xs-uncond-0134217-0.110.pkl` | 2.85 | FID 3.86 |
| `edm2-img512-s-uncond-autog-dino` | `edm2-img512-s-uncond-2147483-0.090.pkl` | `edm2-img512-xs-uncond-0134217-0.125.pkl` | 2.90 | FD<sub>DINOv2</sub> 90.39 |
| `edm2-img64-s-autog-fid` | `edm2-img64-s-1073741-0.045.pkl` | `edm2-img64-xs-0134217-0.110.pkl` | 1.70 | FID 1.01 |
| `edm2-img64-s-autog-dino` | `edm2-img64-s-1073741-0.105.pkl` | `edm2-img64-xs-0134217-0.175.pkl` | 2.20 | FD<sub>DINOv2</sub> 31.85 |
### NVlabs preset shorthand
```text
# EDM2 paper
edm2-img512-{xs|s|m|l|xl|xxl}-{fid|dino}
edm2-img64-{s|m|l|xl}-fid
edm2-img512-{xs|s|m|l|xl|xxl}-guid-{fid|dino}
# Autoguidance paper
edm2-img512-{s|xxl}-autog-{fid|dino}
edm2-img512-s-uncond-autog-{fid|dino}
edm2-img64-s-autog-{fid|dino}
```
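
The shorthand can be expanded into the full list of preset names. The sketch below mirrors the brace patterns one line at a time; `expand_presets` is an illustrative name.

```python
from itertools import product

SIZES_512 = ["xs", "s", "m", "l", "xl", "xxl"]
SIZES_64 = ["s", "m", "l", "xl"]
METRICS = ["fid", "dino"]


def expand_presets():
    """Enumerate every preset name covered by the shorthand patterns."""
    # EDM2 paper
    names = [f"edm2-img512-{s}-{m}" for s, m in product(SIZES_512, METRICS)]
    names += [f"edm2-img64-{s}-fid" for s in SIZES_64]
    names += [f"edm2-img512-{s}-guid-{m}" for s, m in product(SIZES_512, METRICS)]
    # Autoguidance paper
    names += [f"edm2-img512-{s}-autog-{m}" for s, m in product(["s", "xxl"], METRICS)]
    names += [f"edm2-img512-s-uncond-autog-{m}" for m in METRICS]
    names += [f"edm2-img64-s-autog-{m}" for m in METRICS]
    return names


presets = expand_presets()
print(len(presets))
```

The count matches the number of rows across the four preset tables in this section.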
Example NVlabs command:
```bash
python generate_images.py --preset=edm2-img512-s-guid-dino --outdir=out
```
Equivalent expanded form:
```bash
python generate_images.py \
  --net=https://nvlabs-fi-cdn.nvidia.com/edm2/posthoc-reconstructions/edm2-img512-s-2147483-0.085.pkl \
  --gnet=https://nvlabs-fi-cdn.nvidia.com/edm2/posthoc-reconstructions/edm2-img512-xs-uncond-2147483-0.085.pkl \
  --guidance=1.9 \
  --outdir=out
```
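
The same expansion can be scripted from a table row. The sketch below hard-codes a single row from the classifier-free guidance table; the `PRESETS` dictionary and the `expand_preset` helper are illustrative and are not read from `generate_images.py`.

```python
MODEL_ROOT = "https://nvlabs-fi-cdn.nvidia.com/edm2/posthoc-reconstructions"

# One row copied from the classifier-free guidance table; other rows have the same shape.
PRESETS = {
    "edm2-img512-s-guid-dino": {
        "net": "edm2-img512-s-2147483-0.085.pkl",
        "gnet": "edm2-img512-xs-uncond-2147483-0.085.pkl",
        "guidance": 1.9,
    },
}


def expand_preset(name, outdir="out"):
    """Build the expanded generate_images.py argument list for a preset (hypothetical helper)."""
    preset = PRESETS[name]
    args = ["python", "generate_images.py", f"--net={MODEL_ROOT}/{preset['net']}"]
    if "gnet" in preset:
        args.append(f"--gnet={MODEL_ROOT}/{preset['gnet']}")
    if "guidance" in preset:
        args.append(f"--guidance={preset['guidance']}")
    args.append(f"--outdir={outdir}")
    return args


print(" ".join(expand_preset("edm2-img512-s-guid-dino")))
```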