|
|
--- |
|
|
license: apache-2.0 |
|
|
--- |
|
|
<div align="center"> |
|
|
|
|
|
<h1> lyraDiff: An Out-of-the-box Acceleration Engine for Diffusion and DiT Models</h1> |
|
|
|
|
|
</div> |
|
|
|
|
|
|
|
|
`lyraDiff` introduces a **recompilation-free** inference engine for Diffusion and DiT models, achieving **state-of-the-art speed**, **extensive model support**, and **pixel-level image consistency**. |
|
|
|
|
|
## Highlights |
|
|
- **State-of-the-art Inference Speed**: `lyraDiff` combines multiple techniques, including **Quantization**, **Fused GEMM Kernels**, **Flash Attention**, and **NHWC & Fused GroupNorm**, to achieve up to a **6.1x** speedup in model inference.
|
|
- **Memory Efficiency**: `lyraDiff` uses a buffer-based DRAM reuse strategy and multiple quantization formats (FP8/INT8/INT4) to save **10-40%** of DRAM usage.
|
|
- **Extensive Model Support**: `lyraDiff` supports a wide range of top generative/SR models, such as **SD1.5, SDXL, FLUX, S3Diff, etc.**, as well as the most commonly used plugins, such as **LoRA, ControlNet, and IP-Adapter**.
|
|
- **Zero Compilation Deployment**: Unlike **TensorRT** or **AITemplate**, which take minutes to compile, `lyraDiff` eliminates runtime recompilation overhead entirely, even for model inputs with dynamic shapes.
|
|
- **Image Gen Consistency**: The outputs of `lyraDiff` are aligned with those of [HF diffusers](https://github.com/huggingface/diffusers) at the pixel level, even when switching LoRAs in quantization mode.
|
|
- **Fast Plugin Hot-swap**: `lyraDiff` provides **super-fast model hot-swapping for ControlNet and LoRA**, which greatly benefits real-time image generation services.
|
|
|
|
|
## Usage |
|
|
|
|
|
 |
|
|
|
|
|
`lyraDiff-Flux.1-dev` is converted from the standard [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) model weights using this [script](https://github.com/TMElyralab/lyraDiff/blob/main/lyradiff/convert_model_scripts/quantize.py) to be compatible with [lyraDiff](https://github.com/TMElyralab/lyraDiff). It contains both `FP8` and `FP16` versions of the converted FLUX.1-dev.
|
|
|
|
|
We provide a reference implementation of the lyraDiff version of FLUX.1-dev, as well as sampling code, in a dedicated [GitHub repository](https://github.com/TMElyralab/lyraDiff).
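To fetch the converted weights locally before running the sampling code from that repository, you can use `huggingface_hub`. This is a minimal sketch; the repo id `TMElyralab/lyraDiff-Flux.1-dev` is an assumption, so substitute the actual repository path of this model card:

```python
# Sketch: download the lyraDiff-converted FLUX.1-dev weights with huggingface_hub.
# The repo id below is an assumption -- replace it with this model card's actual path.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="TMElyralab/lyraDiff-Flux.1-dev",
    allow_patterns=["*.md", "*.json"],  # fetch only small metadata files in this sketch
)
print(local_dir)  # local cache directory containing the downloaded files
```

Dropping `allow_patterns` downloads the full `FP8` and `FP16` weight files, which are several gigabytes.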
|
|
|
|
|
## Citation |
|
|
``` bibtex |
|
|
@Misc{lyraDiff_2025, |
|
|
author = {Yibo Lu and Sa Xiao and Kangjian Wu and Bin Wu and Mian Peng and Haoxiong Su and Qiwen Mao and Wenjiang Zhou},
|
|
title = {lyraDiff: An Out-of-the-box Acceleration Engine for Diffusion and DiT Models}, |
|
|
howpublished = {\url{https://github.com/TMElyralab/lyraDiff}}, |
|
|
year = {2025} |
|
|
} |
|
|
``` |
|
|
|
|
|
## License |
|
|
`lyraDiff-Flux.1-dev` falls under the [`FLUX.1 [dev]` Non-Commercial License](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md). |
|
|
|