---
license: cc-by-nc-4.0
library_name: diffusers
pipeline_tag: text-to-image
tags:
- diffusers
- dit
- image-generation
- class-conditional
- imagenet
widget:
- output:
    url: DiT-XL-2-512/demo.png
language:
- en
---

# BiliSakura/DiT-diffusers

Diffusers-ready checkpoints for **Diffusion Transformers (DiT)**, re-packaged for local/offline use with a project-owned custom `DiTPipeline`.

> **Re-distribution notice:** weights and configs in this repo are re-distributed from [`facebook/DiT-XL-2-512`](https://huggingface.co/facebook/DiT-XL-2-512). Original work: [Scalable Diffusion Models with Transformers (ICCV 2023)](https://openaccess.thecvf.com/content/ICCV2023/html/Peebles_Scalable_Diffusion_Models_with_Transformers_ICCV_2023_paper.html). License: [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).

This repo is derived from the development bundle in [Visual-Generative-Foundation-Model-Collection](https://github.com/Bili-Sakura/Visual-Generative-Foundation-Model-Collection). Inference needs only:

- This model repo (`BiliSakura/DiT-diffusers`)
- The `diffusers`, `torch`, and `safetensors` packages from PyPI

## Important note

This repo intentionally does **not** use the `DiTPipeline` class built into Diffusers (`diffusers.DiTPipeline`).
Instead, each model subfolder ships a `pipeline.py` that defines a custom class, also named `DiTPipeline`.

## Available checkpoints

| Subfolder | Resolution | Source |
| --- | --- | --- |
| [`DiT-XL-2-256/`](DiT-XL-2-256/) | 256×256 | [`facebook/DiT-XL-2-256`](https://huggingface.co/facebook/DiT-XL-2-256) |
| [`DiT-XL-2-512/`](DiT-XL-2-512/) | 512×512 | [`facebook/DiT-XL-2-512`](https://huggingface.co/facebook/DiT-XL-2-512) |

Each subfolder is a self-contained Diffusers model repo with:

- `model_index.json` (includes ImageNet `id2label`)
- `pipeline.py` (custom `DiTPipeline`)
- `transformer/diffusion_pytorch_model.safetensors`
- `vae/diffusion_pytorch_model.safetensors`
- `scheduler/scheduler_config.json`
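
Before loading offline, it can help to confirm that a downloaded subfolder actually contains the files listed above. The following is a minimal sketch; `EXPECTED_FILES` and `missing_files` are hypothetical names introduced for illustration and are not part of this repo or of Diffusers.

```python
from pathlib import Path

# Files this README lists for each checkpoint subfolder.
EXPECTED_FILES = (
    "model_index.json",
    "pipeline.py",
    "transformer/diffusion_pytorch_model.safetensors",
    "vae/diffusion_pytorch_model.safetensors",
    "scheduler/scheduler_config.json",
)


def missing_files(model_dir):
    """Return the expected files that are absent from model_dir (hypothetical helper)."""
    root = Path(model_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

An empty list from `missing_files("path/to/DiT-XL-2-512")` suggests the subfolder is complete enough to load with `local_files_only=True`.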

## Demo

![DiT-XL-2-512 demo](DiT-XL-2-512/demo.png)

```python
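from pathlib import Path
import torch
from diffusers import DiffusionPipeline

# Local path to one of the checkpoint subfolders in this repo.
model_dir = Path("path/to/DiT-XL-2-512")

# Load the custom DiTPipeline from the subfolder's pipeline.py, without network access.
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

# Fix the seed so the sample is reproducible.
generator = torch.Generator(device="cuda").manual_seed(0)
out = pipe(
    class_labels=[207],  # ImageNet-1k class index (207 is "golden retriever")
    num_inference_steps=250,
    guidance_scale=4.0,
    generator=generator,
).images[0]
out  # renders inline in a notebook; in a script, save it with out.save(...)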
```
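
The `class_labels` argument takes ImageNet class indices, and `model_index.json` carries the `id2label` table for looking them up. The sketch below assumes `id2label` is a top-level JSON object mapping class-id strings to label text; this README does not show the exact layout, so adjust the key path if your file differs. `load_id2label` and `find_class_ids` are hypothetical helpers, not part of the custom pipeline.

```python
import json
from pathlib import Path


def load_id2label(model_dir):
    """Read id2label from model_index.json.

    Assumes a top-level {"<class id>": "<label>"} mapping (not confirmed by this README).
    """
    with open(Path(model_dir) / "model_index.json", encoding="utf-8") as f:
        index = json.load(f)
    return {int(class_id): label for class_id, label in index["id2label"].items()}


def find_class_ids(id2label, query):
    """Return the sorted class ids whose label contains query, ignoring case."""
    needle = query.lower()
    return sorted(i for i, label in id2label.items() if needle in label.lower())
```

For example, `find_class_ids(load_id2label(model_dir), "retriever")` returns candidate indices that can be passed to `pipe(class_labels=[...])`.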

## Repo layout

```text
BiliSakura/DiT-diffusers/
├── README.md
├── DiT-XL-2-256/
└── DiT-XL-2-512/
    ├── README.md
    ├── model_index.json
    ├── pipeline.py
    ├── demo.png
    ├── transformer/
    │   ├── config.json
    │   └── diffusion_pytorch_model.safetensors
    ├── vae/
    │   ├── config.json
    │   └── diffusion_pytorch_model.safetensors
    └── scheduler/
        └── scheduler_config.json
```