Generate a talking video from a single image and audio
Generate lip-synced video from audio and a reference clip