dmusingu/lapvqa-vqa-native

+---
+tags:
+- chest-xray
+- radiology
+- visual-question-answering
+- mimic-cxr
+license: apache-2.0
+---
+# LAPVQA — VQA (Native / End-to-end)
+Part of the [LAPVQA collection](https://huggingface.co/collections/dmusingu/lapvqa).
+## Description
+VQA task heads trained with **end-to-end fine-tuning**: the encoder weights are
+updated jointly with the task head. This provides a baseline for measuring how much
+domain adaptation improves on the frozen-encoder setup in [`lapvqa-vqa`](https://huggingface.co/dmusingu/lapvqa-vqa).
+## Files
+| File | Encoder backbone |
+|---|---|
+| `clip-vit-l14_best.pt` | CLIP ViT-L/14 (fine-tuned) |
+| `siglip_best.pt` | SigLIP (fine-tuned) |
+| `florence2_best.pt` | Florence-2 (fine-tuned) |
+| `coca_best.pt` | CoCa (fine-tuned) |
+| `mae-vit-l16_best.pt` | MAE ViT-L/16 (fine-tuned) |
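
The table above maps each encoder backbone to one checkpoint file. As a minimal sketch of how a consumer might locate a file, the snippet below builds a download URL using the Hub's standard `resolve` path. The helper `checkpoint_url`, the `CHECKPOINTS` dictionary, and the `main` revision are assumptions made for illustration; they are not part of this repository.

```python
from urllib.parse import quote

# Repository id as shown for this model card; "main" is an assumed revision.
REPO_ID = "dmusingu/lapvqa-vqa-native"
REVISION = "main"

# Checkpoint file names exactly as listed in the Files table.
CHECKPOINTS = {
    "CLIP ViT-L/14": "clip-vit-l14_best.pt",
    "SigLIP": "siglip_best.pt",
    "Florence-2": "florence2_best.pt",
    "CoCa": "coca_best.pt",
    "MAE ViT-L/16": "mae-vit-l16_best.pt",
}


def checkpoint_url(backbone: str) -> str:
    """Return the Hub download URL for a backbone's checkpoint (hypothetical helper)."""
    try:
        filename = CHECKPOINTS[backbone]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(
            f"unknown backbone {backbone!r}; expected one of: {known}"
        ) from None
    # The Hub serves raw files at /<repo_id>/resolve/<revision>/<filename>.
    return f"https://huggingface.co/{REPO_ID}/resolve/{REVISION}/{quote(filename)}"


if __name__ == "__main__":
    for name in CHECKPOINTS:
        print(f"{name}: {checkpoint_url(name)}")
```

The same files can also be fetched with `huggingface_hub.hf_hub_download(repo_id=..., filename=...)`. This card does not document the internal layout of the `.pt` files (for example, whether each holds a plain `state_dict` or a wrapper dictionary), so inspect a checkpoint before loading it into a model.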