Generate a sound effect that matches a video shot
Generate speech in a cloned voice from a short audio sample