Voice conversion framework based on VITS
Generate a depth map from an image
Generate a 3D mesh model from an image