|
|
--- |
|
|
license: apache-2.0 |
|
|
--- |
|
|
<div align="center"> |
|
|
|
|
|
<h1> lyraDiff: An Out-of-the-box Acceleration Engine for Diffusion and DiT Models</h1> |
|
|
|
|
|
</div> |
|
|
|
|
|
|
|
|
`lyraDiff` introduces a **recompilation-free** inference engine for Diffusion and DiT models, achieving **state-of-the-art speed**, **extensive model support**, and **pixel-level image consistency**. |
|
|
|
|
|
## Highlights |
|
|
- **State-of-the-art Inference Speed**: `lyraDiff` combines multiple techniques, including **Quantization**, **Fused GEMM Kernels**, **Flash Attention**, and **NHWC & Fused GroupNorm**, to achieve up to a **6.1x** speedup in model inference.
|
|
- **Memory Efficiency**: `lyraDiff` uses a buffer-based DRAM reuse strategy and multiple quantization formats (FP8/INT8/INT4) to save **10-40%** of DRAM usage.
|
|
- **Extensive Model Support**: `lyraDiff` supports a wide range of top generative/SR models, such as **SD1.5, SDXL, FLUX, S3Diff, etc.**, as well as the most commonly used plugins, such as **LoRA, ControlNet, and IP-Adapter**.
|
|
- **Zero Compilation Deployment**: Unlike **TensorRT** or **AITemplate**, which take minutes to compile, `lyraDiff` eliminates runtime recompilation overhead entirely, even for model inputs with dynamic shapes.
|
|
- **Image Gen Consistency**: The outputs of `lyraDiff` are aligned with those of [HF diffusers](https://github.com/huggingface/diffusers) at the pixel level, even when switching LoRAs in quantization mode.
|
|
- **Fast Plugin Hot-swap**: `lyraDiff` provides **super-fast model hot-swapping for ControlNet and LoRA**, which greatly benefits real-time image generation services.
|
|
|
|
|
## Usage |
|
|
|
|
|
 |
|
|
|
|
|
`lyraDiff-Flux.1-dev` is converted from the standard [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) model weights using this [script](https://github.com/TMElyralab/lyraDiff/blob/main/lyradiff/convert_model_scripts/quantize.py) to be compatible with [lyraDiff](https://github.com/TMElyralab/lyraDiff). It contains both `FP8` and `FP16` versions of the converted FLUX.1-dev.
|
|
|
|
|
We provide a reference implementation of the lyraDiff version of FLUX.1-dev, as well as sampling code, in a dedicated [GitHub repository](https://github.com/TMElyralab/lyraDiff).
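To fetch the converted weights locally before running the sampling code from that repository, you can use `huggingface_hub`. This is a minimal sketch; the repo id `TMElyralab/lyraDiff-Flux.1-dev` is an assumption, so substitute the actual repository path of this model card:

```python
# Sketch: download the lyraDiff-converted FLUX.1-dev weights with huggingface_hub.
# The repo id below is an assumption -- replace it with this model card's actual path.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="TMElyralab/lyraDiff-Flux.1-dev",
    allow_patterns=["*.md", "*.json"],  # fetch only small metadata files in this sketch
)
print(local_dir)  # local cache directory containing the downloaded files
```

Dropping `allow_patterns` downloads the full `FP8` and `FP16` weight files, which are several gigabytes.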
|
|
|
|
|
## Citation |
|
|
``` bibtex |
|
|
@Misc{lyraDiff_2025, |
|
|
author = {Yibo Lu and Sa Xiao and Kangjian Wu and Bin Wu and Mian Peng and Haoxiong Su and Qiwen Mao and Wenjiang Zhou},
|
|
title = {lyraDiff: An Out-of-the-box Acceleration Engine for Diffusion and DiT Models}, |
|
|
howpublished = {\url{https://github.com/TMElyralab/lyraDiff}}, |
|
|
year = {2025} |
|
|
} |
|
|
``` |
|
|
|
|
|
## License |
|
|
`lyraDiff-Flux.1-dev` falls under the [`FLUX.1 [dev]` Non-Commercial License](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md). |
|
|
|