Update README.md
README.md CHANGED
@@ -1,34 +1,27 @@
 ---
 license: apache-2.0
 ---
-<
-<div align="center">
-Sa Xiao<sup>*</sup>, Yibo Lu<sup>*</sup>, Kangjian Wu<sup>*</sup>, Bin Wu<sup>†</sup>, Haoxiong Su, Mian Peng, Qiwen Mao, Wenjiang Zhou</br>(*co-first author), (†Corresponding Author, benbinwu@tencent.com)</br>
-Lyra Lab, Tencent Music Entertainment</br>
-<p>[<a href="https://github.com/TMElyralab/lyraDiff">github</a>] </p>
-</div>
+<div align="center">
 
-
+<h1> lyraDiff: An Out-of-the-box Acceleration Engine for Diffusion and DiT Models</h1>
 
-
+</div>
 
-The core features include:
-- 🚀 **State-of-the-art Inference Speed**: `lyraDiff` utilizes multiple techniques to achieve up to 2x speedup of the model inference, including **Quantization**, **Fused GEMM Kernels**, **Flash Attention**, and **NHWC & Fused GroupNorm**.
-- 🔥 **Memory Efficiency**: `lyraDiff` utilizes buffer-based DRAM reuse strategy and multiple types of quantizations (FP8/INT8/INT4) to save **10-40%** of DRAM usage.
-- 🔥 **Extensive Model Support**: `lyraDiff` supports a wide range of Generative/SR models such as **SD1.5, SDXL, FLUX, S3Diff, SUPIR, etc.**, and those most commonly used plugins such as **LoRA, ControlNet and Ip-Adapter**.
-- 🔥 **Zero Compilation Deployment**: Unlike **TensorRT** or **AITemplate**, which takes minutes to compile, `lyraDiff` eliminates runtime recompilation overhead even with model inputs of dynamic shapes.
-- 🔥 **Image Gen Consistency**: The outputs of `lyraDiff` are aligned with the ones of [HF diffusers](https://github.com/huggingface/diffusers) at the pixel level, even under LoRA switch in quantization mode.
-- 🚀 **Fast Plugin Hot-swap**: `lyraDiff` provides **Super Fast Model Hot-swap for ControlNet and LoRA** which can hugely benefit a real-time image gen service.
 
+`lyraDiff` introduces a **recompilation-free** inference engine for Diffusion and DiT models, achieving **state-of-the-art speed**, **extensive model support**, and **pixel-level image consistency**.
 
-
-
-
+## Highlights
+- **State-of-the-art Inference Speed**: `lyraDiff` utilizes multiple techniques to achieve up to **6.1x** speedup of the model inference, including **Quantization**, **Fused GEMM Kernels**, **Flash Attention**, and **NHWC & Fused GroupNorm**.
+- **Memory Efficiency**: `lyraDiff` utilizes a buffer-based DRAM reuse strategy and multiple types of quantization (FP8/INT8/INT4) to save **10-40%** of DRAM usage.
+- **Extensive Model Support**: `lyraDiff` supports a wide range of top Generative/SR models such as **SD1.5, SDXL, FLUX, S3Diff, etc.**, and the most commonly used plugins such as **LoRA, ControlNet and IP-Adapter**.
+- **Zero Compilation Deployment**: Unlike **TensorRT** or **AITemplate**, which take minutes to compile, `lyraDiff` eliminates runtime recompilation overhead even with model inputs of dynamic shapes.
+- **Image Gen Consistency**: The outputs of `lyraDiff` are aligned with those of [HF diffusers](https://github.com/huggingface/diffusers) at the pixel level, even under LoRA switch in quantization mode.
+- **Fast Plugin Hot-swap**: `lyraDiff` provides **Super Fast Model Hot-swap for ControlNet and LoRA**, which can hugely benefit a real-time image gen service.
 
 ## Usage
 
-
-
+`lyraDiff-IP-Adapters` is converted from the standard [IP-Adapter](https://huggingface.co/h94/IP-Adapter) weights using this [script](https://github.com/TMElyralab/lyraDiff/blob/main/lyradiff/convert_model_scripts/convert_ipadapter.py) to be compatible with [lyraDiff](https://github.com/TMElyralab/lyraDiff), and contains both SD1.5 and SDXL versions of the converted IP-Adapter.
+
 We provide a reference implementation of lyraDiff version of SD1.5/SDXL, as well as sampling code, in a dedicated [github repository](https://github.com/TMElyralab/lyraDiff).
 
 ### Example