PaliGemma2 LoRA fine-tuned on VQAv2
Find keypoints in images
Vision Transformer Attention Visualization