Refine selected image regions with a text prompt
Generate a talking face video from an image and audio
Official Space for SpatialTrackerV2