Voice conversion framework based on VITS
Generate depth maps from any input photo
Generate a 3D mesh from a single image