Lifu Wang
committed on
Update README.md
README.md
CHANGED
@@ -7,4 +7,20 @@ tags: []
 Official Repository of the paper: *[Scaling Down Text Encoders of Text-to-Image Diffusion Models](https://github.com/LifuWang-66/ScalingDownTextEncoder/tree/main)*.
-Project Page: https://lifuwang-66.github.io/ScalingDownTE/
+Project Page: https://lifuwang-66.github.io/ScalingDownTE/
+
+## Model Descriptions:
+T5-Base distilled from [T5-XXL](https://huggingface.co/google/flan-t5-xxl) using [Flux](https://huggingface.co/runwayml/stable-diffusion-v1-5).
+It is 50 times smaller than T5-XXL and retains most of its capability.
+
+## Generation Results:
+
+<p align="center">
+<img src="teaser.png">
+</p>
+
+By distilling classifier-free guidance into the model's input, LCM can generate high-quality images with very short inference time. We compare inference time at 768 x 768 resolution with CFG scale w=8 and batch size 4 on an A800 GPU.
+
+<p align="center">
+<img src="speed_fid.png">
+</p>
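
The added README text compares inference time at a fixed resolution, guidance scale, and batch size. A comparison of that kind could be measured with a harness along these lines. This is a minimal sketch, not the authors' benchmark code: `time_batches` and `fake_generate` are hypothetical names, and `fake_generate` only sleeps in place of a real generation call.

```python
import statistics
import time
from typing import Callable, List


def time_batches(generate: Callable[[int], object], batch_size: int = 4,
                 warmup: int = 1, repeats: int = 5) -> List[float]:
    """Return wall-clock seconds for each timed call to generate(batch_size)."""
    for _ in range(warmup):
        generate(batch_size)  # warm-up calls are executed but not timed
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        generate(batch_size)
        timings.append(time.perf_counter() - start)
    return timings


def fake_generate(batch_size: int) -> list:
    """Hypothetical stand-in for an image-generation call (assumption)."""
    time.sleep(0.01 * batch_size)  # pretend each image costs 10 ms
    return [None] * batch_size


if __name__ == "__main__":
    timings = time_batches(fake_generate, batch_size=4)
    print(f"median seconds per batch of 4: {statistics.median(timings):.3f}")
```

When the timed call runs on a GPU, kernels are launched asynchronously, so the call would need to wait for the device to finish (for example with `torch.cuda.synchronize()` in PyTorch) before the clock is read. Reporting the median rather than the mean keeps a single slow run from skewing the result.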