Video Understanding with Interleaved Visual-Textual Tokens
Create simple animations with doodles
Create a personalized video of your face in any camera shot
Generate images from text prompts