Convert text prompts and images into spoken responses
Generate a meaningful conversation about an uploaded image