lyraDiff: An Out-of-the-box Acceleration Engine for Diffusion and DiT Models
lyraDiff is a recompilation-free inference engine for Diffusion and DiT models, delivering state-of-the-art speed, extensive model support, and pixel-level image consistency.
Highlights
- State-of-the-art Inference Speed: lyraDiff uses multiple techniques, including quantization, fused GEMM kernels, Flash Attention, and NHWC & fused GroupNorm, to achieve up to a 6.1x speedup of model inference.
- Memory Efficiency: lyraDiff uses a buffer-based DRAM reuse strategy and multiple quantization formats (FP8/INT8/INT4) to save 10-40% of DRAM usage.
- Extensive Model Support: lyraDiff supports a wide range of popular generative/SR models such as SD1.5, SDXL, FLUX, and S3Diff, as well as the most commonly used plugins such as LoRA, ControlNet, and IP-Adapter.
- Zero Compilation Deployment: Unlike TensorRT or AITemplate, which take minutes to compile, lyraDiff eliminates runtime recompilation overhead, even with dynamically shaped model inputs.
- Image Gen Consistency: lyraDiff outputs are aligned with those of HF diffusers at the pixel level, even when switching LoRAs in quantization mode.
- Fast Plugin Hot-swap: lyraDiff provides super-fast model hot-swapping for ControlNet and LoRA, which can greatly benefit a real-time image generation service.
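The pixel-level consistency claim above can be verified mechanically. A minimal sketch of such a check, assuming both engines produce uint8 RGB arrays of the same shape (the function name and tolerance are illustrative, not part of lyraDiff):

```python
import numpy as np

def max_pixel_diff(img_a: np.ndarray, img_b: np.ndarray) -> int:
    """Largest absolute per-channel difference between two uint8 images."""
    assert img_a.shape == img_b.shape, "images must share a shape"
    # Widen to int16 first so the subtraction cannot wrap around.
    return int(np.max(np.abs(img_a.astype(np.int16) - img_b.astype(np.int16))))

# Identical images differ by 0; "pixel-level aligned" outputs should
# stay within a tiny tolerance (e.g. at most 1 gray level).
a = np.zeros((4, 4, 3), dtype=np.uint8)
b = a.copy()
b[0, 0, 0] = 1
print(max_pixel_diff(a, a))  # → 0
print(max_pixel_diff(a, b))  # → 1
```

In practice one would generate with a fixed seed in both HF diffusers and lyraDiff, then apply this check to the two outputs.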
Usage
lyraDiff-Flux.1-dev is converted from the standard FLUX.1-dev model weights using this script to be compatible with lyraDiff, and contains both FP8 and FP16 versions of the converted FLUX.1-dev weights.
We provide a reference implementation of the lyraDiff version of FLUX.1-dev, along with sampling code, in a dedicated GitHub repository.
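To illustrate why both FP8 and FP16 variants are shipped, here is a rough back-of-the-envelope estimate of the transformer weight footprint for each precision. The 12e9 parameter count is the approximate published size of FLUX.1-dev; this is an illustration, not a measurement of lyraDiff's actual DRAM usage:

```python
# Approximate FLUX.1-dev transformer parameter count (assumption for
# illustration; the exact figure varies by what is counted).
PARAMS = 12e9
BYTES_PER_PARAM = {"FP16": 2, "FP8": 1}

def weight_gib(dtype: str) -> float:
    """Approximate weight footprint in GiB for the given precision."""
    return PARAMS * BYTES_PER_PARAM[dtype] / 2**30

for dt in BYTES_PER_PARAM:
    print(f"{dt}: ~{weight_gib(dt):.1f} GiB")
```

Halving the bytes per weight roughly halves the weight memory, which is the main reason the FP8 variant fits on smaller GPUs.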
Citation
@Misc{lyraDiff_2025,
author = {Yibo Lu and Sa Xiao and Kangjian Wu and Bin Wu and Mian Peng and Haoxiong Su and Qiwen Mao and Wenjiang Zhou},
title = {lyraDiff: An Out-of-the-box Acceleration Engine for Diffusion and DiT Models},
howpublished = {\url{https://github.com/TMElyralab/lyraDiff}},
year = {2025}
}
License
lyraDiff-Flux.1-dev falls under the FLUX.1 [dev] Non-Commercial License.
