UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors Paper • 2605.00658 • Published 13 days ago • 81
Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer Paper • 2603.19227 • Published Mar 19 • 42
FantasyVLN: Unified Multimodal Chain-of-Thought Reasoning for Vision-Language Navigation Paper • 2601.13976 • Published Jan 20 • 22
Article New ViT and ALIGN Models From Kakao Brain +2 adirik, Unso, dylan-m, jun-untitled • Mar 6, 2023 • 6
stabilityai/stable-diffusion-3.5-large-controlnet-canny Text-to-Image • Updated Nov 28, 2024 • 198 • 15