Generate speech from text using a reference voice
Generate watermark-free videos with a ModelScope-based model
Generate depth maps and 3D views from photos