Papers: Cross-Attention
Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers
• 2403.12943
• Published • 15
Masked Audio Generation using a Single Non-Autoregressive Transformer
• 2401.04577
• Published • 44
Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models
• 2404.02747
• Published • 13
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation
• 2404.02733
• Published • 22
Prompt-to-Prompt Image Editing with Cross Attention Control
• 2208.01626
• Published • 3
Paper
• 2404.07821
• Published • 13
HSIDMamba: Exploring Bidirectional State-Space Models for Hyperspectral Denoising
• 2404.09697
• Published • 1
TextHawk: Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models
• 2404.09204
• Published • 11
Long-form music generation with latent diffusion
• 2404.10301
• Published • 27
GLIGEN: Open-Set Grounded Text-to-Image Generation
• 2301.07093
• Published • 4
MultiBooth: Towards Generating All Your Concepts in an Image from Text
• 2404.14239
• Published • 9
XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference
• 2404.15420
• Published • 11
InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation
• 2404.19427
• Published • 74
Unveiling Encoder-Free Vision-Language Models
• 2406.11832
• Published • 54
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
• 2410.23168
• Published • 24
HAT: Hybrid Attention Transformer for Image Restoration
• 2309.05239
• Published • 1
Byte Latent Transformer: Patches Scale Better Than Tokens
• 2412.09871
• Published • 108
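
The papers above are collected around a single mechanism: cross-attention, where queries come from one sequence and keys/values from another (e.g. text tokens conditioning image latents in a diffusion model). As a reminder of that mechanism, here is a minimal NumPy sketch of scaled dot-product cross-attention — single head, no masking, no learned projections; all shapes are illustrative and not taken from any of the listed papers:

```python
import numpy as np

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention.

    queries: (n_q, d_k)  tokens from the attending sequence
    keys:    (n_kv, d_k) tokens from the conditioning sequence
    values:  (n_kv, d_v) payloads aligned with the keys
    returns: (n_q, d_v)  one context vector per query
    """
    d_k = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d_k)           # (n_q, n_kv) similarities
    # Numerically stable row-wise softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ values                            # weighted sum of values

# Illustrative shapes: 4 query tokens attending over 6 context tokens
rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))
k = rng.standard_normal((6, 8))
v = rng.standard_normal((6, 8))
out = cross_attention(q, k, v)
print(out.shape)  # (4, 8)
```

In the full transformer block, `queries`, `keys`, and `values` would each pass through learned linear projections first, and the operation is repeated per head; the listed papers vary exactly this step (caching it, masking it, skipping it, or replacing it).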