---
tags:
- chest-xray
- radiology
- report-generation
- mimic-cxr
license: apache-2.0
---

# LAPVQA: Radiology Report Generation (Captioning-Pretrained Encoder)

Part of the [LAPVQA collection](https://huggingface.co/collections/dmusingu/lapvqa).

## Description

Radiology report generation (RRG) decoder trained on top of the frozen **LAPVQA captioning-pretrained encoder**
([`lapvqa-pretrain-captioning`](https://huggingface.co/dmusingu/lapvqa-pretrain-captioning)).
The checkpoint is a dictionary with the keys `state_dict`, `vis_dim`, `d_model`, `num_layers`, `nhead`, `encoder`, `epoch`, and `val_bleu4`.

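Before rebuilding the head, it can help to confirm that a loaded checkpoint carries the keys listed above. The following is a minimal sketch in plain Python; `missing_checkpoint_keys`, `head_kwargs`, and `dummy_ckpt` are hypothetical names introduced for illustration and are not part of the `lapvqa` package. The values in `dummy_ckpt` are placeholders, not the real hyperparameters.

```python
# Keys listed in the checkpoint format described in this card.
EXPECTED_KEYS = (
    "state_dict", "vis_dim", "d_model", "num_layers",
    "nhead", "encoder", "epoch", "val_bleu4",
)


def missing_checkpoint_keys(ckpt):
    """Return the expected keys that are absent from a loaded checkpoint dict."""
    return [key for key in EXPECTED_KEYS if key not in ckpt]


def head_kwargs(ckpt):
    """Collect the hyperparameters needed to rebuild the decoder head."""
    missing = missing_checkpoint_keys(ckpt)
    if missing:
        raise KeyError(f"checkpoint is missing keys: {missing}")
    return {name: ckpt[name] for name in ("vis_dim", "d_model", "num_layers", "nhead")}


# Placeholder values for illustration only (hypothetical, not the real settings).
dummy_ckpt = {
    "state_dict": {}, "vis_dim": 8, "d_model": 16, "num_layers": 2,
    "nhead": 4, "encoder": "placeholder", "epoch": 0, "val_bleu4": 0.0,
}
print(head_kwargs(dummy_ckpt))
```

With a real checkpoint, the dictionary returned by `head_kwargs` could be unpacked into the head constructor, assuming the constructor accepts exactly these four keyword arguments as in the loading snippet.
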
## Loading

```python
import torch
import tiktoken
from lapvqa.rrg.heads import ReportGenerationHead

# Rebuild the decoder head from the hyperparameters stored in the checkpoint.
ckpt = torch.load("pretrain-captioning.pt", map_location="cpu")
head = ReportGenerationHead(
    vis_dim=ckpt["vis_dim"],
    d_model=ckpt["d_model"],
    num_layers=ckpt["num_layers"],
    nhead=ckpt["nhead"],
)
head.load_state_dict(ckpt["state_dict"])
head.eval()

# GPT-2 BPE has a single special token, so it serves as both BOS and EOS.
enc = tiktoken.get_encoding("gpt2")
bos_id = eos_id = enc.eot_token

# `vis_tokens` must come from the frozen encoder (encoder_final.pt in
# lapvqa-pretrain-captioning); it is not defined in this snippet.
with torch.no_grad():
    token_ids = head.generate(vis_tokens, bos_id=bos_id, eos_id=eos_id)
reports = [enc.decode(ids) for ids in token_ids]
```

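Because the loading snippet uses the same token for BOS and EOS, decoded reports may contain stray `<|endoftext|>` markers if `generate` returns sequences that still include them. This card does not specify whether it does, so the following is only a sketch under that assumption. `trim_generated` is a hypothetical helper, and the ids in the example call are arbitrary.

```python
GPT2_EOT = 50256  # id of "<|endoftext|>" in the GPT-2 vocabulary


def trim_generated(ids, bos_id=GPT2_EOT, eos_id=GPT2_EOT):
    """Drop a leading BOS token, then cut the sequence at the first EOS token."""
    ids = [int(t) for t in ids]  # also accepts a 1-D tensor or array of ids
    # Strip BOS first: since BOS and EOS share an id, cutting at the first
    # EOS before this step would discard the whole sequence.
    if ids and ids[0] == bos_id:
        ids = ids[1:]
    if eos_id in ids:
        ids = ids[: ids.index(eos_id)]
    return ids


# Arbitrary token ids, wrapped in BOS/EOS and followed by one extra EOS.
print(trim_generated([GPT2_EOT, 464, 3290, GPT2_EOT, GPT2_EOT]))
```

Under that assumption, the decoding step would become `enc.decode(trim_generated(ids))` for each entry of `token_ids`.
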
|