Generate detailed prompts from any image
Generate a depth map from any input image
Image to 3D with DPT + 3D Point Cloud
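
The image-to-3D item pairs a depth estimate with a point cloud. A minimal sketch of that second step, assuming a pinhole camera model and a depth map already available as a NumPy array, is given below. The function name `depth_to_point_cloud` and the intrinsics `fx`, `fy`, `cx`, `cy` are illustrative assumptions, not taken from any of the listed tools. Monocular models such as DPT typically predict relative rather than metric depth, so the scale of the resulting cloud would be arbitrary.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (H*W, 3) array of XYZ points.

    Hypothetical helper: assumes a pinhole camera with focal lengths
    (fx, fy) and principal point (cx, cy), all in pixel units.
    """
    h, w = depth.shape
    # u holds column indices and v holds row indices, both with shape (h, w).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Toy input: a flat surface 2 units from the camera (made-up values).
depth = np.full((4, 6), 2.0)
points = depth_to_point_cloud(depth, fx=5.0, fy=5.0, cx=2.5, cy=1.5)
print(points.shape)
```

The resulting array has one XYZ row per pixel and can be handed to a point-cloud viewer or library of choice.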