LWD — Learning When to Denoise

EMA weights for "Learning When to Denoise: Optimizing Asynchronous Schedules for Latent Diffusion."

📄 Paper: https://arxiv.org/abs/2606.19662
💻 Code: https://github.com/bsq532087/LWD

These are the EMA weights of the LightningDiT-XL/1 (675M-parameter) denoiser trained with our learned asynchronous semantic–texture schedule on class-conditional ImageNet 256×256.

Checkpoints

File	Training budget	Unguided FID	AutoGuidance FID
`xl_400k.pt`	400K iter (≈80 epochs)	2.87	1.14
`xl_1m.pt`	1M iter (≈200 epochs)	2.37	1.05
`xl_3m.pt`	3M iter (≈600 epochs)	2.14	1.02
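
The iteration-to-epoch conversions in the table can be checked with a little arithmetic. The sketch below assumes a global batch size of 256 and the standard ImageNet-1k training set of 1,281,167 images. Neither number is stated in this card; the batch size is only what the listed ratios imply, so treat it as an assumption rather than the authors' configuration.

```python
# Assumptions (not stated in this card): ImageNet-1k training set size and
# a global batch size of 256, chosen because it reproduces the table's ratios.
IMAGENET_TRAIN_IMAGES = 1_281_167
ASSUMED_BATCH_SIZE = 256


def iterations_to_epochs(iterations: int,
                         batch_size: int = ASSUMED_BATCH_SIZE,
                         dataset_size: int = IMAGENET_TRAIN_IMAGES) -> float:
    """Convert optimizer iterations to passes over the dataset."""
    return iterations * batch_size / dataset_size


if __name__ == "__main__":
    for name, iters in [("xl_400k.pt", 400_000),
                        ("xl_1m.pt", 1_000_000),
                        ("xl_3m.pt", 3_000_000)]:
        print(f"{name}: {iterations_to_epochs(iters):.1f} epochs")
```

Under these assumptions the three budgets round to 80, 200, and 600 epochs, matching the table.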

Each file is a slim checkpoint of the form {'ema': state_dict} and works as a drop-in checkpoint for the inference script in the code repository.
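
To inspect a checkpoint outside the inference script, a minimal sketch along these lines should work. It relies only on the {'ema': state_dict} layout described above. The helper names `extract_ema`, `load_ema`, and `count_parameters` are illustrative and do not come from the code repository.

```python
from typing import Any, Mapping


def extract_ema(checkpoint: Mapping[str, Any]) -> Mapping[str, Any]:
    """Return the EMA state dict from a {'ema': state_dict} checkpoint."""
    if "ema" not in checkpoint:
        raise KeyError(f"expected an 'ema' key, found: {sorted(checkpoint)}")
    return checkpoint["ema"]


def load_ema(path: str) -> Mapping[str, Any]:
    """Load a checkpoint file onto the CPU and return its EMA state dict."""
    import torch  # imported lazily so extract_ema works without PyTorch

    # weights_only=True restricts unpickling to tensors and plain containers.
    checkpoint = torch.load(path, map_location="cpu", weights_only=True)
    return extract_ema(checkpoint)


def count_parameters(state_dict: Mapping[str, Any]) -> int:
    """Sum the element counts of every tensor in a state dict."""
    return sum(tensor.numel() for tensor in state_dict.values())
```

The returned state dict can then be passed to `load_state_dict` on a denoiser built with the code repository; how to construct that model is defined there and is not shown here. If the state dict holds only parameters, `count_parameters` should land near the 675M figure quoted above, though any extra buffers would shift the total.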

Usage

from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download("bsq532087/LWD", "xl_3m.pt")
# then point the code repo's inference config / --ckpt at `ckpt_path`

The texture latent decoder (SD-VAE f16-d32) and the SemVAE semantic encoder are inherited from SFD / LightningDiT; see the code repository for how to obtain them.

License & attribution

Released under the MIT License. The denoiser backbone derives from LightningDiT, and the semantic-first latent setup and SemVAE encoder come from SFD; please also respect the licenses of those projects.

Citation

@article{qian2026learning,
  title   = {Learning When to Denoise: Optimizing Asynchronous Schedules for Latent Diffusion},
  author  = {Qian, Bingshuo and Cheng, Xiang},
  journal = {arXiv preprint arXiv:2606.19662},
  year    = {2026},
}
