using the llama model
implementing an image caption
detecting a picture
To do classification like what the company marqo did