When repurposing a T2I model into a pure I2I model, there’s always that orphaned text path — what do we do with it? 🤔
You can reuse it as learnable embeddings in multi-task setups [2], freeze an empty text prompt, or distill or prune the corresponding part.
In LBM, they take a clever route — zeroing [3] and reshaping [4] the text-related cross-attentions into self-attentions. This gives you fresh weights for I2I computation, nicely integrated into your SD architecture.
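To make the idea concrete, here is a minimal NumPy sketch of one way to read "zeroing and reshaping". It is not LBM's actual code: the function names (`attention`, `cross_to_self`), the toy dimensions, and the choice to zero both the key and value projections are assumptions made for illustration. The key/value projections that used to take text embeddings of width `text_dim` are replaced by zero-initialized ones that take the image hidden states of width `hidden_dim`, and the layer then attends over its own input.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(hidden, context, w_q, w_k, w_v, w_o):
    # Single-head attention: queries come from `hidden`, keys/values from `context`.
    q = hidden @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return scores @ v @ w_o

def cross_to_self(w_k, w_v, hidden_dim):
    # Hypothetical helper: swap the (text_dim, inner_dim) key/value projections
    # for zero-initialized (hidden_dim, inner_dim) ones, so the layer can
    # consume image tokens instead of text tokens.
    inner_dim = w_k.shape[1]
    new_w_k = np.zeros((hidden_dim, inner_dim), dtype=w_k.dtype)
    new_w_v = np.zeros((hidden_dim, inner_dim), dtype=w_v.dtype)
    return new_w_k, new_w_v

# Toy sizes, chosen only for the demo.
rng = np.random.default_rng(0)
hidden_dim, text_dim, inner_dim, n_tokens = 8, 16, 8, 4

w_q = rng.normal(size=(hidden_dim, inner_dim))  # kept from the "pretrained" layer
w_k = rng.normal(size=(text_dim, inner_dim))    # text-shaped, to be replaced
w_v = rng.normal(size=(text_dim, inner_dim))    # text-shaped, to be replaced
w_o = rng.normal(size=(inner_dim, hidden_dim))  # kept from the "pretrained" layer

new_w_k, new_w_v = cross_to_self(w_k, w_v, hidden_dim)

# The former cross-attention now runs as self-attention over image tokens.
hidden = rng.normal(size=(n_tokens, hidden_dim))
out = attention(hidden, hidden, w_q, new_w_k, new_w_v, w_o)
```

With zeroed value weights the layer outputs zeros at initialization, so inside a residual block (`hidden + out`) it starts as a no-op, and the new key/value weights are then free to learn image-to-image interactions during fine-tuning.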