Generate a video from two images and text prompts
https://huggingface.co/papers/2501.03006
Generate images from text prompts
Generate images from text descriptions
Scalable and versatile 3D generation from images
Generate depth maps from images
Generate a depth map from an image