---
license: mit
library_name: diffusers
pipeline_tag: text-to-image
tags:
- diffusers
- adm
- adm-g
- image-generation
- class-conditional
widget:
- output:
    url: ADM-G-512/demo.png
language:
- en
---

# BiliSakura/ADM-diffusers

Self-contained OpenAI ADM-G checkpoints for Hugging Face diffusers. **No external code repo is required**: each subfolder ships its own `pipeline.py`, component modules, and weights.

This repo is derived from the development bundle in [Visual-Generative-Foundation-Model-Collection](https://github.com/Bili-Sakura/Visual-Generative-Foundation-Model-Collection), but inference only needs:

- This model repo (`BiliSakura/ADM-diffusers`)
- PyPI `diffusers`, `torch`, `huggingface_hub`

This Hugging Face repo hosts **multiple self-contained checkpoints as subfolders**. Each subfolder includes its own `pipeline.py`, `model_index.json`, weights, and component code (`unet/`, `classifier/`, `scheduler/`).
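
Because each checkpoint lives in its own subfolder, one way to fetch a single variant is to filter the download by path. The following is a minimal sketch, assuming `huggingface_hub.snapshot_download` with its `allow_patterns` and `local_dir` parameters; the helper names `variant_patterns` and `download_variant` are hypothetical and not part of this repo.

```python
from pathlib import Path

REPO_ID = "BiliSakura/ADM-diffusers"


def variant_patterns(variant: str) -> list[str]:
    """Glob patterns that select a single checkpoint subfolder."""
    # fnmatch-style "*" also matches "/", so nested files such as
    # "<variant>/unet/..." are included.
    return [f"{variant}/*"]


def download_variant(variant: str, root: str = "./BiliSakura/ADM-diffusers") -> Path:
    """Download one subfolder of the repo and return its local path.

    Hypothetical helper: the default root is only an illustrative location.
    """
    # Imported lazily so variant_patterns() works without huggingface_hub.
    from huggingface_hub import snapshot_download

    local_root = snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=variant_patterns(variant),
        local_dir=root,
    )
    return Path(local_root) / variant
```

With the default `root`, `download_variant("ADM-G-512")` should place the files under `./BiliSakura/ADM-diffusers/ADM-G-512`, which is the local path the demo snippet in this README loads with `local_files_only=True`.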

## Available checkpoints

| Subfolder | Resolution | Guidance scale | OpenAI sources |
| --- | --- | ---: | --- |
| [`ADM-G-256/`](ADM-G-256/) | 256×256 | 1.0 | `256x256_diffusion.pt` + `256x256_classifier.pt` |
| [`ADM-G-512/`](ADM-G-512/) | 512×512 | 4.0 | `512x512_diffusion.pt` + `512x512_classifier.pt` |

Both resolutions use the **class-conditional** diffusion checkpoint plus the noisy classifier (not the unconditional 256×256 variant).

## ImageNet class labels

Each variant keeps an `id2label` map directly in its own `model_index.json` (same style as DiT on the Hub). Runtime label resolution is English-only:

- `pipe.id2label`: inspect the id → English label correspondence
- `pipe.labels`: reverse map (English synonym → id), sorted for browsing
- `pipe.get_label_ids("golden retriever")`
- `pipe(class_labels="golden retriever", ...)`

Chinese labels are still preserved in the main source repo under `src/labels/id2label_cn.json` for reference.
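
The label resolution described above can be sketched in plain Python. This is an illustration only, assuming DiT-style `id2label` values in which synonyms are separated by commas; the three-entry map and the functions `build_labels` and `get_label_ids` are stand-ins written for this example, not the code shipped in `pipeline.py`.

```python
# Toy subset of an ImageNet-style id2label map (illustrative only);
# the real map is stored in each variant's model_index.json.
ID2LABEL = {
    "0": "tench, Tinca tinca",
    "207": "golden retriever",
    "208": "Labrador retriever",
}


def build_labels(id2label: dict[str, str]) -> dict[str, int]:
    """Build the reverse map: every comma-separated synonym -> class id."""
    labels: dict[str, int] = {}
    for class_id, names in id2label.items():
        for name in names.split(","):
            labels[name.strip()] = int(class_id)
    # Sort by label text so the map is easy to browse.
    return dict(sorted(labels.items()))


def get_label_ids(labels: dict[str, int], label) -> list[int]:
    """Resolve one label or a list of labels to class ids."""
    names = [label] if isinstance(label, str) else list(label)
    unknown = [name for name in names if name not in labels]
    if unknown:
        raise ValueError(f"Unknown label(s): {unknown}")
    return [labels[name] for name in names]


LABELS = build_labels(ID2LABEL)
print(get_label_ids(LABELS, "golden retriever"))
```

Returning a list even for a single label mirrors the `pipe.get_label_ids("golden retriever")[0]` indexing used in the demo snippet.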

## Demo

![ADM-G-512 demo](ADM-G-512/demo.png)

Settings used for this demo image: `ADM-G-512`, `DDIMScheduler`, `num_inference_steps=50`, `guidance_scale=4.0`, `seed=42`, class `"golden retriever"`.

```python
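from pathlib import Path

import torch
from diffusers import DDIMScheduler, DiffusionPipeline

# Local copy of the checkpoint subfolder; it must already be on disk
# because local_files_only=True disables downloads.
model_dir = Path("./BiliSakura/ADM-diffusers/ADM-G-512")

# Load the pipeline class from the subfolder's own pipeline.py.
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe = pipe.to("cuda")

# Switch to DDIM sampling, reusing the shipped scheduler config.
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

# Resolve the English label to a class id and seed the generator.
class_id = pipe.get_label_ids("golden retriever")[0]
generator = torch.Generator(device="cuda").manual_seed(42)

out = pipe(
    class_labels=class_id,
    guidance_scale=4.0,
    num_inference_steps=50,
    generator=generator,
).images[0]
out  # shows the image when this is the last expression in a notebook cell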
```

## Repo layout

```text
BiliSakura/ADM-diffusers/
├── README.md
├── ADM-G-256/
│   ├── pipeline.py
│   ├── model_index.json
│   ├── unet/
│   ├── classifier/
│   └── scheduler/
└── ADM-G-512/
    ├── pipeline.py
    ├── model_index.json
    ├── demo.png
    ├── unet/
    ├── classifier/
    └── scheduler/
```
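
Before loading a local copy, it can help to confirm that a checkpoint subfolder matches the layout above. The check below is a small sketch using only the standard library; `missing_entries` is a hypothetical helper, and it only tests that the listed entries exist, not that the weights inside them are complete.

```python
from pathlib import Path

# Entries every checkpoint subfolder is expected to contain, per the layout above.
EXPECTED_FILES = ("pipeline.py", "model_index.json")
EXPECTED_DIRS = ("unet", "classifier", "scheduler")


def missing_entries(model_dir) -> list[str]:
    """Return the expected files and folders that are absent from model_dir."""
    root = Path(model_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [f"{name}/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing
```

An empty list means all five entries were found; for example, `missing_entries("./BiliSakura/ADM-diffusers/ADM-G-512")` can be called right before `DiffusionPipeline.from_pretrained`.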