---
base_model:
  - black-forest-labs/FLUX.1-dev
  - stabilityai/stable-diffusion-3.5-medium
library_name: diffusers
license: mit
pipeline_tag: text-to-image
---

TACA: Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

¹The University of Hong Kong, ²Nanjing University,
³University of Chinese Academy of Sciences, ⁴Nanyang Technological University,
⁵Harbin Institute of Technology
(*Equal Contribution. Project Leader. Corresponding Author.)

Paper | Project Page | LoRA Weights | Code

About

We propose TACA, a parameter-efficient method that dynamically rebalances cross-modal attention in multimodal diffusion transformers to improve text-image alignment.
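
To make "rebalancing cross-modal attention" concrete, here is a minimal pure-Python sketch. It assumes the rebalancing is done by multiplying the pre-softmax logits of text keys by a factor `gamma` during the noisiest denoising steps. The function names, the `gamma` value, and the timestep threshold are all hypothetical and chosen for illustration; they are not taken from the released code or weights.

```python
import math


def rebalanced_attention_weights(logits, num_text_tokens, gamma=1.0):
    """Softmax over one image query's logits, with text-key logits scaled by gamma.

    `logits` holds pre-softmax scores against [text keys..., image keys...].
    Assumption: text keys come first, as in a joint text+image token sequence.
    """
    scaled = [s * gamma if i < num_text_tokens else s for i, s in enumerate(logits)]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def gamma_for_timestep(t, threshold=0.9, gamma=1.5):
    """Hypothetical schedule: boost text attention only at early (noisy) steps.

    `t` is a normalized timestep in [0, 1], where 1 is pure noise.
    """
    return gamma if t >= threshold else 1.0


# Toy example: 2 text keys followed by 4 image keys, all with equal logits.
logits = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
baseline = rebalanced_attention_weights(logits, num_text_tokens=2)
boosted = rebalanced_attention_weights(
    logits, num_text_tokens=2, gamma=gamma_for_timestep(0.95)
)
text_mass_before = sum(baseline[:2])
text_mass_after = sum(boosted[:2])
```

In this toy case the text tokens are outnumbered by image tokens, so they receive only a third of the attention mass before scaling. With `gamma > 1` their share grows, which is the kind of shift toward the prompt that the description above refers to.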