Distribution Matching Distillation
This distilled model is trained following Alibaba's Decoupled DMD ("CFG Augmentation as the Spear, Distribution Matching as the Shield"), which decouples the DMD loss into a CFG augmentation loss computed at two independent timesteps and a new distribution matching loss that serves as regularization.
Experiments were conducted on the DMD2 codebase.
This model also uses the backward simulation designed in DMD2, adapted here to SDv1.5 (the original code supported only SDXL).
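The decoupling described above can be sketched as follows. This is a minimal illustration, not the repository's implementation: `cfg_aug_loss`, `dm_loss`, and `reg_weight` are hypothetical stand-ins for the actual loss terms and weighting defined in the Decoupled DMD paper.

```python
import random

def decoupled_dmd_loss(cfg_aug_loss, dm_loss, num_timesteps=1000, reg_weight=1.0):
    # Sample two independent timesteps: one for the CFG augmentation term,
    # one for the distribution matching regularizer.
    t_cfg = random.randrange(num_timesteps)
    t_dm = random.randrange(num_timesteps)
    # CFG augmentation drives the update (the "spear"); distribution
    # matching acts as regularization (the "shield").
    return cfg_aug_loss(t_cfg) + reg_weight * dm_loss(t_dm)
```

The key point the sketch captures is that the two terms are evaluated at independently sampled timesteps rather than sharing one, which is what distinguishes the decoupled formulation from the original DMD loss.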
Training used the following parameters:

```shell
--real_guidance_scale 3.0 \
--fake_guidance_scale 1.0 \
--max_grad_norm 10.0 \
--use_fp16 \
--log_loss \
--dfake_gen_update_ratio 5 \
--fsdp \
--denoising \
--num_denoising_step 4 \
--denoising_timestep 1000 \
--backward_simulation \
--use_decoupled_dmd \
--min_step_percent 0.0 \
--max_step_percent 1.0
```
The checkpoint obtained at 2000 steps (2000 generator updates, 10000 guidance updates, real_guidance_scale = 3.0) achieved a CLIP score of 0.325.
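For context, CLIP score is the mean cosine similarity between CLIP embeddings of each generated image and its prompt. A minimal sketch of the metric follows, assuming the unscaled cosine-similarity definition (which the 0.325 value suggests); the embeddings themselves would come from a CLIP model and are not computed here.

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def clip_score(image_embs, text_embs):
    # Mean image-text cosine similarity over the evaluation set.
    pairs = list(zip(image_embs, text_embs))
    return sum(cosine_similarity(i, t) for i, t in pairs) / len(pairs)
```

Note that some implementations scale this value (e.g. by 100) or clamp it at zero, so scores from different evaluation scripts are not directly comparable.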
Base model
stable-diffusion-v1-5/stable-diffusion-v1-5