---
tags:
- chest-xray
- radiology
- report-generation
- mimic-cxr
license: apache-2.0
---
# LAPVQA — Radiology Report Generation (Captioning-Pretrained Encoder)
Part of the [LAPVQA collection](https://huggingface.co/collections/dmusingu/lapvqa).
## Description
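Radiology report generation (RRG) decoder head trained on top of the frozen **LAPVQA captioning-pretrained encoder**
([`lapvqa-pretrain-captioning`](https://huggingface.co/dmusingu/lapvqa-pretrain-captioning)). Only the decoder is trained; the encoder weights stay fixed.
The checkpoint is a dictionary with the keys `state_dict`, `vis_dim`, `d_model`, `num_layers`, `nhead`, `encoder`, `epoch`, and `val_bleu4`. The `state_dict` entry holds the decoder weights, and `vis_dim`, `d_model`, `num_layers`, and `nhead` are the hyperparameters needed to rebuild the head.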
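As a quick sanity check before building the head, one could verify that a loaded checkpoint carries the keys listed above. The snippet below is a minimal sketch, not part of the `lapvqa` package: `EXPECTED_KEYS` and `missing_checkpoint_keys` are hypothetical names, and the example dict uses made-up placeholder values in place of a real `torch.load(...)` result.

```python
# Keys taken from the checkpoint format described above.
EXPECTED_KEYS = (
    "state_dict", "vis_dim", "d_model", "num_layers",
    "nhead", "encoder", "epoch", "val_bleu4",
)


def missing_checkpoint_keys(ckpt):
    """Return the expected keys that are absent from ckpt, in listed order.

    Hypothetical helper for illustration; not provided by lapvqa.
    """
    return [key for key in EXPECTED_KEYS if key not in ckpt]


# Stand-in dict with made-up values; a real one would come from torch.load(...).
example_ckpt = {"state_dict": {}, "vis_dim": 768, "d_model": 512,
                "num_layers": 6, "nhead": 8}
print(missing_checkpoint_keys(example_ckpt))  # ['encoder', 'epoch', 'val_bleu4']
```

An empty list means all documented keys are present; anything else suggests the file is a different checkpoint type or was saved by another version of the training code.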
## Loading
```python
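import torch
import tiktoken

from lapvqa.rrg.heads import ReportGenerationHead

# Load the checkpoint on CPU; it is a dict with the weights and the head's hyperparameters.
ckpt = torch.load("pretrain-captioning.pt", map_location="cpu")

# Rebuild the head from the hyperparameters stored in the checkpoint.
head = ReportGenerationHead(
    vis_dim=ckpt["vis_dim"],
    d_model=ckpt["d_model"],
    num_layers=ckpt["num_layers"],
    nhead=ckpt["nhead"],
)
head.load_state_dict(ckpt["state_dict"])
head.eval()

# GPT-2 BPE tokenizer; its end-of-text token serves as both BOS and EOS.
enc = tiktoken.get_encoding("gpt2")
bos_id = eos_id = enc.eot_token

# vis_tokens is not defined in this snippet: it is the visual-token output of the
# frozen encoder (encoder_final.pt from lapvqa-pretrain-captioning).
with torch.no_grad():
    token_ids = head.generate(vis_tokens, bos_id=bos_id, eos_id=eos_id)
reports = [enc.decode(ids) for ids in token_ids]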
```
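Because BOS and EOS share the same token id in the loading snippet above, decoded reports may contain the end-of-text marker if `head.generate` returns sequences that include those tokens. Whether it does is an assumption here; the card does not say. The sketch below shows one way to trim them before decoding, using a hypothetical helper `trim_special_tokens` that is not part of `lapvqa`.

```python
def trim_special_tokens(ids, bos_id, eos_id):
    """Drop a leading BOS, then cut the sequence at the first EOS that follows.

    Hypothetical helper; assumes ids is a sequence of plain ints.
    """
    ids = list(ids)
    # Remove the BOS first so that a shared BOS/EOS id does not empty the sequence.
    if ids and ids[0] == bos_id:
        ids = ids[1:]
    if eos_id in ids:
        ids = ids[: ids.index(eos_id)]
    return ids


EOT = 50256  # GPT-2 end-of-text id, i.e. the value of enc.eot_token
print(trim_special_tokens([EOT, 11, 22, 33, EOT, 44], bos_id=EOT, eos_id=EOT))
# [11, 22, 33]
```

With such a helper, the last line of the loading snippet could become `reports = [enc.decode(trim_special_tokens(ids, bos_id, eos_id)) for ids in token_ids]`, assuming each element of `token_ids` is a list of ints.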