ldiex committed on
Commit
e8f231a
·
verified ·
1 Parent(s): c83afa1

Update README.md

  1. README.md +47 -1
README.md CHANGED
@@ -4,4 +4,50 @@ base_model:
4
  - black-forest-labs/FLUX.1-dev
5
  - stabilityai/stable-diffusion-3.5-medium
6
  pipeline_tag: text-to-image
7
- ---
4
  - black-forest-labs/FLUX.1-dev
5
  - stabilityai/stable-diffusion-3.5-medium
6
  pipeline_tag: text-to-image
7
+ ---
8
+
9
+ <div align="center">
10
+ <h1>TACA: Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers</h1>
11
+ </div>
12
+
13
+ <div align="center">
14
+ <span class="author-block">
15
+ <a href="https://scholar.google.com/citations?user=FkkaUgwAAAAJ&hl=en" target="_blank">Zhengyao Lv*</a><sup>1</sup>,
16
+ </span>
17
+ <span class="author-block">
18
+ <a href="https://tianlinn.com/" target="_blank">Tianlin Pan*</a><sup>2,3</sup>,
19
+ </span>
20
+ <span class="author-block">
21
+ <a href="https://chenyangsi.github.io/" target="_blank">Chenyang Si</a><sup>2‡†</sup>,
22
+ </span>
23
+ <span class="author-block">
24
+ <a href="https://frozenburning.github.io/" target="_blank">Zhaoxi Chen</a><sup>4</sup>,
25
+ </span>
26
+ <span class="author-block">
27
+ <a href="https://homepage.hit.edu.cn/wangmengzuo" target="_blank">Wangmeng Zuo</a><sup>5</sup>,
28
+ </span>
29
+ <span class="author-block">
30
+ <a href="https://liuziwei7.github.io/" target="_blank">Ziwei Liu</a><sup>4†</sup>,
31
+ </span>
32
+ <span class="author-block">
33
+ <a href="https://i.cs.hku.hk/~kykwong/" target="_blank">Kwan-Yee K. Wong</a><sup>1†</sup>
34
+ </span>
35
+ </div>
36
+
37
+ <div align="center">
38
+ <sup>1</sup>The University of Hong Kong &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
39
+ <sup>2</sup>Nanjing University <br>
40
+ <sup>3</sup>University of Chinese Academy of Sciences &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
41
+ <sup>4</sup>Nanyang Technological University<br>
42
+ <sup>5</sup>Harbin Institute of Technology
43
+ </div>
44
+ <div align="center">(*Equal Contribution.&nbsp;&nbsp;&nbsp;&nbsp;<sup>‡</sup>Project Leader.&nbsp;&nbsp;&nbsp;&nbsp;<sup>†</sup>Corresponding Author.)</div>
45
+
46
+ <p align="center">
47
+ <a href="https://arxiv.org/abs/">Paper</a> |
48
+ <a href="https://vchitect.github.io/TACA/">Project Page</a> |
49
+ <a href="https://huggingface.co/ldiex/TACA/tree/main">LoRA Weights</a>
50
+ </p>
51
+
52
+ # About
53
+ We propose **TACA**, a parameter-efficient method that dynamically rebalances cross-modal attention in multimodal diffusion transformers to improve text-image alignment.
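
To make "rebalancing cross-modal attention" concrete, here is a minimal NumPy sketch of one way such a rebalancing could look in a joint text-image attention layer. It is an illustration under stated assumptions, not the implementation from this repository: the function names, the single-head layout with text tokens first, and the scale factor `gamma` are all hypothetical and do not come from this README.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def joint_attention(q, k, v, n_text, gamma=1.0):
    """Single-head attention over the sequence [text tokens; image tokens].

    q, k, v : arrays of shape (n_text + n_image, d), text tokens first.
    n_text  : number of text tokens at the start of the sequence.
    gamma   : hypothetical scale applied only to the logits where an image
              query attends to a text key. gamma=1.0 is ordinary attention.
    Returns (output, attention_weights).
    """
    d = q.shape[-1]
    logits = (q @ k.T) / np.sqrt(d)
    # Rescale only the image-query -> text-key block of the logit matrix.
    logits[n_text:, :n_text] *= gamma
    weights = softmax(logits, axis=-1)
    return weights @ v, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_text, n_image, d = 3, 5, 8
    q, k, v = (rng.normal(size=(n_text + n_image, d)) for _ in range(3))

    _, base = joint_attention(q, k, v, n_text, gamma=1.0)
    _, scaled = joint_attention(q, k, v, n_text, gamma=1.5)

    # Share of attention that each image query gives to text keys.
    print("text share, gamma=1.0:", base[n_text:, :n_text].sum(axis=-1))
    print("text share, gamma=1.5:", scaled[n_text:, :n_text].sum(axis=-1))
```

In this sketch only the rows belonging to image queries change. Rows for text queries are identical for any `gamma`, and every row still sums to 1 because the softmax is applied after the rescaling. Whether a given image query ends up attending more to text depends on the sign and size of its image-to-text logits.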