tencent/HunyuanImage-3.0
Text-to-Image • 83B • Updated • 635k • 636
Generate depth video from input video
Audio Conditioned LipSync with Latent Diffusion Models
Generate new person images with swapped clothes or poses
Generate audio from video and text prompts