UnityShots: Memory-Driven Multi-Shot Audio-Video Generation with Boundary-Aware Gating Paper • 2606.21661 • Published 7 days ago • 19
ComposeAnyone: Controllable Layout-to-Human Generation with Decoupled Multimodal Conditions Paper • 2501.12173 • Published Jan 21, 2025 • 1
BridgeIV: Bridging Customized Image and Video Generation through Test-Time Autoregressive Identity Propagation Paper • 2505.06985 • Published May 11, 2025
LaVieID: Local Autoregressive Diffusion Transformers for Identity-Preserving Video Creation Paper • 2508.07603 • Published Aug 11, 2025 • 1
Zero-shot 3D-Aware Trajectory-Guided image-to-video generation via Test-Time Training Paper • 2509.06723 • Published Sep 8, 2025
UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation Paper • 2512.07831 • Published Dec 8, 2025 • 17
ReCamDriving: LiDAR-Free Camera-Controlled Novel Trajectory Video Generation Paper • 2512.03621 • Published Dec 3, 2025 • 9
From Inpainting to Editing: A Self-Bootstrapping Framework for Context-Rich Visual Dubbing Paper • 2512.25066 • Published Dec 31, 2025 • 5
EditInfinity: Image Editing with Binary-Quantized Generative Models Paper • 2510.20217 • Published Oct 23, 2025 • 1
Risk Awareness Injection: Calibrating Vision-Language Models for Safety without Compromising Utility Paper • 2602.03402 • Published Feb 3 • 1
VP-VLA: Visual Prompting as an Interface for Vision-Language-Action Models Paper • 2603.22003 • Published Mar 23 • 12
ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving Paper • 2404.16771 • Published Dec 28, 2024 • 19