---
license: apache-2.0
---

# AnyControl: Create Your Artwork with Versatile Control on Text-to-Image Generation

[Yanan Sun](https://scholar.google.com/citations?user=6TA1oPkAAAAJ&hl=en), Yanchen Liu, Yinhao Tang, [Wenjie Pei](https://wenjiepei.github.io/) and [Kai Chen*](https://chenkai.site/)

**Shanghai AI Laboratory**

![overview](assets/assets_overview.png)

## Overview
The field of text-to-image (T2I) generation has made significant progress in recent years, largely driven by advancements in diffusion models. Linguistic control enables effective content creation but struggles with fine-grained control over image generation. This challenge has been explored, to a great extent, by incorporating additional user-supplied spatial conditions, such as depth maps and edge maps, into pre-trained T2I models through extra encoding. However, multi-control image synthesis still faces several challenges: current approaches are limited in handling free combinations of diverse input control signals, overlook the complex relationships among multiple spatial conditions, and often fail to maintain semantic alignment with the provided textual prompts, which can lead to suboptimal user experiences.

To address these challenges, we propose AnyControl, a multi-control image synthesis framework that supports arbitrary combinations of diverse control signals. AnyControl develops a novel Multi-Control Encoder that extracts a unified multi-modal embedding to guide the generation process. This approach enables a holistic understanding of user inputs and produces high-quality, faithful results under versatile control signals, as demonstrated by extensive quantitative and qualitative evaluations.

## Model Card

AnyControl for SD 1.5:

- `ckpts/anycontrol_15.ckpt`: weights for AnyControl.
- `ckpts/init_local.ckpt`: initial weights used to train AnyControl, generated following [Uni-ControlNet](https://github.com/ShihaoZhaoZSH/Uni-ControlNet).
- `ckpts/blip2_pretrained.pth`: third-party pre-trained model.
- `annotator/ckpts`: third-party models used by the annotators.
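As a quick sanity check after downloading, the layout above can be verified with a short script. This is a minimal sketch: the relative paths are taken from the list above, while the `missing_checkpoints` helper is illustrative and not part of the repository.

```python
from pathlib import Path

# Checkpoint files listed in the model card above.
# Paths are assumed to be relative to the repository root.
EXPECTED_CKPTS = [
    "ckpts/anycontrol_15.ckpt",
    "ckpts/init_local.ckpt",
    "ckpts/blip2_pretrained.pth",
]

def missing_checkpoints(root="."):
    """Return the expected checkpoint paths that are not present under `root`."""
    root = Path(root)
    return [p for p in EXPECTED_CKPTS if not (root / p).is_file()]
```

Running `missing_checkpoints()` from the repository root should return an empty list once all checkpoints are in place.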

## License and Citation

All models and assets are under the [Apache 2.0 license](./LICENSE) unless specified otherwise.

If this work is helpful for your research, please consider citing the following BibTeX entry.

``` bibtex
@inproceedings{sun2024anycontrol,
  title={AnyControl: Create your artwork with versatile control on text-to-image generation},
  author={Sun, Yanan and Liu, Yanchen and Tang, Yinhao and Pei, Wenjie and Chen, Kai},
  booktitle={ECCV},
  year={2024}
}
```