---
tags:
- chest-xray
- radiology
- report-generation
- mimic-cxr
license: apache-2.0
---

# LAPVQA: Radiology Report Generation (Captioning-Pretrained Encoder)

Part of the [LAPVQA collection](https://huggingface.co/collections/dmusingu/lapvqa).

## Description

A radiology report generation (RRG) decoder trained on top of the frozen **LAPVQA captioning-pretrained encoder** ([`lapvqa-pretrain-captioning`](https://huggingface.co/dmusingu/lapvqa-pretrain-captioning)).

Checkpoint format: `{state_dict, vis_dim, d_model, num_layers, nhead, encoder, epoch, val_bleu4}`.

## Loading

```python
import torch
import tiktoken
from lapvqa.rrg.heads import ReportGenerationHead

# Load the checkpoint on CPU; it stores the decoder weights and hyperparameters.
ckpt = torch.load("pretrain-captioning.pt", map_location="cpu")

# Rebuild the head from the hyperparameters saved in the checkpoint.
head = ReportGenerationHead(
    vis_dim=ckpt["vis_dim"],
    d_model=ckpt["d_model"],
    num_layers=ckpt["num_layers"],
    nhead=ckpt["nhead"],
)
head.load_state_dict(ckpt["state_dict"])
head.eval()

# The GPT-2 end-of-text token serves as both BOS and EOS.
enc = tiktoken.get_encoding("gpt2")
bos_id = eos_id = enc.eot_token

# vis_tokens is not defined here: it is the output of the frozen encoder
# (encoder_final.pt from lapvqa-pretrain-captioning).
token_ids = head.generate(vis_tokens, bos_id=bos_id, eos_id=eos_id)
reports = [enc.decode(ids) for ids in token_ids]
```
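
Since the checkpoint carries its own hyperparameters, it can be useful to confirm that the documented keys are present before building the head. The following is a minimal sketch under that assumption; the helper names `missing_checkpoint_keys` and `head_kwargs` are hypothetical and are not part of the `lapvqa` package.

```python
# Hypothetical helpers (not part of lapvqa): validate a loaded checkpoint dict
# against the keys listed under "Checkpoint format".
EXPECTED_KEYS = {
    "state_dict", "vis_dim", "d_model", "num_layers",
    "nhead", "encoder", "epoch", "val_bleu4",
}

# Keys that the loading snippet passes to ReportGenerationHead.
HEAD_KEYS = ("vis_dim", "d_model", "num_layers", "nhead")


def missing_checkpoint_keys(ckpt):
    """Return the sorted list of documented keys that are absent from ckpt."""
    return sorted(EXPECTED_KEYS - set(ckpt))


def head_kwargs(ckpt):
    """Return the constructor arguments for the head, or raise if keys are missing."""
    missing = missing_checkpoint_keys(ckpt)
    if missing:
        raise KeyError(f"checkpoint is missing keys: {missing}")
    return {key: ckpt[key] for key in HEAD_KEYS}
```

With a checkpoint loaded as in the snippet above, `ReportGenerationHead(**head_kwargs(ckpt))` would be equivalent to passing the four arguments by hand, and a truncated or mismatched file fails early with a readable error.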