Generate video from audio and image
Wan 2.2 by Alibaba
Generate tags for images using Waifu Diffusion models
A Generalist Diffusion Model for Vision Perception