tags:
  - chest-xray
  - radiology
  - visual-question-answering
  - differential-vqa
  - mimic-cxr
license: apache-2.0

LAPVQA — Differential VQA (Captioning-Pretrained Encoder)

Part of the LAPVQA collection.

Description

A DiffVQA head trained on top of the frozen LAPVQA captioning-pretrained encoder (lapvqa-pretrain-captioning). The checkpoint is a plain DiffVQAHead state dict (vis_dim=1024); the encoder weights are not included and must be loaded separately.

Results (test set)

| BLEU-4 | ROUGE-2 | RadGraph-s | BERTScore F1 |
|--------|---------|------------|--------------|
| 0.468  | 0.562   | 0.303      | 0.938        |

Loading

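import torch
from lapvqa.diffvqa.model import DiffVQAHead

# The checkpoint is a plain state dict, so weights_only=True is sufficient
# and avoids unpickling arbitrary Python objects.
ckpt = torch.load("pretrain-captioning_best.pt", map_location="cpu", weights_only=True)
head = DiffVQAHead(vis_dim=1024)  # vis_dim as stated in the description
head.load_state_dict(ckpt)
head.eval()  # switch to inference mode
# Pair with encoder_final.pt from lapvqa-pretrain-captioning.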
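
Before calling load_state_dict, it can help to confirm that the file really is a plain name-to-tensor mapping and not a wrapper such as {"model": ...}. The helper below is a minimal sketch, not part of the lapvqa package: summarize_state_dict is a hypothetical name, and it relies only on each value exposing a .shape attribute, which torch tensors do.

```python
from math import prod


def summarize_state_dict(state):
    """Check that `state` is a plain name -> tensor mapping and summarize it.

    Returns (n_tensors, n_params, prefixes), where prefixes is the sorted list
    of top-level module names (the text before the first dot in each key).
    Hypothetical helper for illustration; not part of lapvqa.
    """
    if not hasattr(state, "items"):
        raise TypeError("expected a mapping, got " + type(state).__name__)
    n_params = 0
    prefixes = set()
    for name, value in state.items():
        # Tensors expose .shape; a nested dict (a wrapped checkpoint) does not.
        if not hasattr(value, "shape"):
            raise ValueError(f"entry {name!r} is not a tensor; checkpoint may be wrapped")
        n_params += prod(value.shape)  # a scalar has shape () and counts as 1
        prefixes.add(str(name).split(".", 1)[0])
    return len(state), n_params, sorted(prefixes)


# Usage with the checkpoint loaded above (requires torch):
#   n_tensors, n_params, prefixes = summarize_state_dict(ckpt)
#   print(n_tensors, n_params, prefixes)
```

If the check raises ValueError, the file is probably a wrapped training checkpoint, and the head weights would need to be extracted from it before loading.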