Detect, segment, and classify objects in images and videos
Describe images and extract text with Florence-2