Generate images from text prompts
Generate dialogue from English context
Generate Chinese dialogue from context
Extract entities and their types from Chinese questions
Segment objects in images using point or text prompts