---
library_name: diffusers
pipeline_tag: unconditional-image-generation
tags:
  - diffusers
  - lightningdit
  - image-generation
  - class-conditional
  - imagenet
  - flow-matching
license: mit
inference: true
widget:
  - output:
      url: LightningDit-XL-1-256/demo.png
language:
  - en
---

# LightningDiT-diffusers

Diffusers-ready checkpoints for **LightningDiT** (VA-VAE–aligned latent diffusion with flow matching), converted from [`hustvl/lightningdit-xl-imagenet256-800ep`](https://huggingface.co/hustvl/lightningdit-xl-imagenet256-800ep) for local/offline use.

This root folder is a model collection that contains:

- `LightningDit-XL-1-256`

Each subfolder is a self-contained Diffusers model repo with:

- `pipeline.py` (`LightningDiTPipeline`)
- `transformer/transformer_lightningdit.py` and weights
- `scheduler/scheduler_config.json` (`FlowMatchHeunDiscreteScheduler`, `shift=0.3`)
- `vae/` ([`REPA-E/vavae-hf`](https://huggingface.co/REPA-E/vavae-hf))
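
The `shift=0.3` setting in the scheduler config can be illustrated with the noise-level warp commonly used by flow-matching schedulers. The sketch below is plain Python and does not call the scheduler itself; the helper names `shift_sigma` and `shifted_schedule` are hypothetical, and it is an assumption that this repo's pipeline applies exactly this formula.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Warp a noise level in [0, 1]; the endpoints 0 and 1 stay fixed."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def shifted_schedule(num_steps: int, shift: float) -> list:
    """Build a decreasing list of warped noise levels (illustrative only)."""
    # Uniform levels from 1.0 (pure noise) down to 1 / num_steps.
    uniform = [1.0 - i / num_steps for i in range(num_steps)]
    return [shift_sigma(s, shift) for s in uniform]


# Assumed value taken from the scheduler config above.
schedule = shifted_schedule(5, shift=0.3)
print([round(s, 4) for s in schedule])
```

With a shift below 1, intermediate levels are pulled toward 0, so under this formula more of the sampling steps are spent at low noise.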

Each variant embeds an English `id2label` mapping in `model_index.json`, so class labels can be passed either as ImageNet class ids or as English synonym strings.
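
As a rough illustration of how a string label could resolve to class ids, the sketch below searches an `id2label` mapping for a matching synonym. The JSON excerpt, its comma-separated synonym layout, and the helper `lookup_label_ids` are assumptions made for this example; the actual `model_index.json` layout and the pipeline's own lookup may differ.

```python
import json

# Hypothetical excerpt of `model_index.json`; the real file may be laid out differently.
MODEL_INDEX = json.loads(
    '{"id2label": {"207": "golden retriever",'
    ' "208": "Labrador retriever",'
    ' "281": "tabby, tabby cat"}}'
)


def lookup_label_ids(query, id2label):
    """Return sorted class ids whose comma-separated synonyms match `query`."""
    wanted = query.strip().lower()
    matches = []
    for class_id, names in id2label.items():
        synonyms = [name.strip().lower() for name in names.split(",")]
        if wanted in synonyms:
            matches.append(int(class_id))
    return sorted(matches)


ids = lookup_label_ids("golden retriever", MODEL_INDEX["id2label"])
print(ids)
```

Matching is case-insensitive in this sketch, and an unknown label yields an empty list rather than an error.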

## Demo

![LightningDiT-XL-1-256 demo](LightningDit-XL-1-256/demo.png)

Class-conditional sample for ImageNet class **207** (golden retriever), generated with `LightningDiT-XL/1` at 256×256 using 250 steps, classifier-free guidance (CFG) scale 6.7, `cfg_interval_start=0.125`, `timestep_shift=0.3`, and seed 0.
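
The interaction between the CFG scale and `cfg_interval_start` can be sketched as follows. This is a minimal illustration, not the pipeline's code: the helper `guided_prediction` is hypothetical, and it assumes that normalized time `t` runs from 0 to 1 during sampling and that guidance is skipped while `t` is below `cfg_interval_start`.

```python
def guided_prediction(cond, uncond, t, guidance_scale=6.7, cfg_interval_start=0.125):
    """Blend conditional and unconditional predictions with interval-gated CFG."""
    # Assumption: guidance is disabled before the interval starts,
    # so the conditional prediction is used unchanged.
    if t < cfg_interval_start:
        return list(cond)
    # Standard classifier-free guidance: move away from the unconditional output.
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


early = guided_prediction([1.0, 2.0], [0.0, 0.0], t=0.05)
late = guided_prediction([1.0, 2.0], [0.0, 0.0], t=0.5)
print(early, late)
```

Restricting guidance to part of the trajectory is a common way to keep a high guidance scale from dominating the steps where it is assumed to be least useful.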

## Model Paths

| Model | Resolution | Local path |
| --- | ---: | --- |
| LightningDiT-XL/1 | 256×256 | `./LightningDit-XL-1-256` |

## Inference

```python
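import torch
from diffusers import DiffusionPipeline

# Load the local checkpoint; trust_remote_code is needed because the
# repo ships its own pipeline.py (LightningDiTPipeline).
pipe = DiffusionPipeline.from_pretrained(
    "./LightningDit-XL-1-256",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

# Resolve an English label to ImageNet class ids and take the first match.
class_id = pipe.get_label_ids("golden retriever")[0]

# Same settings as the demo image above; the seeded generator makes the run repeatable.
image = pipe(
    class_labels=class_id,
    num_inference_steps=250,
    guidance_scale=6.7,
    cfg_interval_start=0.125,
    timestep_shift=0.3,
    generator=torch.Generator(device="cuda").manual_seed(0),
).images[0]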
```

## Citation

```bibtex
@inproceedings{yao2025reconstruction,
  title={Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models},
  author={Yao, Jingfeng and Yang, Bin and Wang, Xinggang},
  booktitle={CVPR},
  year={2025}
}
```