A Systematic Study of Cross-Modal Typographic Attacks on Audio-Visual Reasoning Paper • 2604.03995 • Published 8 days ago • 4
When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models Paper • 2604.08546 • Published 4 days ago • 108
MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping Paper • 2604.08364 • Published 4 days ago • 91
Paper Espresso: From Paper Overload to Research Insight Paper • 2604.04562 • Published 7 days ago • 10