LayerComposer: Interactive Personalized T2I via Spatially-Aware Layered Canvas • Paper • 2510.20820 • Published Oct 23 • 10
Canvas-to-Image: Compositional Image Generation with Multimodal Controls • Paper • 2511.21691 • Published about 1 month ago • 35
Orientation-aware Vehicle Re-identification with Semantics-guided Part Attention Network • Paper • 2008.11423 • Published Aug 26, 2020
Omni-Attribute: Open-vocabulary Attribute Encoder for Visual Concept Personalization • Paper • 2512.10955 • Published 16 days ago • 6
Multi-subject Open-set Personalization in Video Generation • Paper • 2501.06187 • Published Jan 10 • 14
VIMI: Grounding Video Generation through Multi-modal Instruction • Paper • 2407.06304 • Published Jul 8, 2024 • 10
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers • Paper • 2402.19479 • Published Feb 29, 2024 • 35
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis • Paper • 2402.14797 • Published Feb 22, 2024 • 21