CLIP-powered Text-to-Image Search
VLMEvalKit Evaluation Results Collection
Process turtle images to detect, segment, and align faces