CxREmbed (multi-image / multi-text unified embeddings)

This repository contains lightweight inference code and trained embedding heads for a multimodal chest X-ray (CXR) embedding model built on the Lingshu-7B / Qwen2.5-VL backbone.

This repo holds only the delta weights (LoRA adapter plus pooling and projection heads). The base model weights remain in the original upstream repository.

What is included

lora/ (optional) — PEFT LoRA adapter weights
unified_pooler.pt — pooling head
unified_proj.pt — projection head to the unified embedding space
text_proj.pt / image_proj.pt (optional)
cxrembed_config.json — minimal configuration
cxrembed/ — small Python package with an inference wrapper
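
Before loading, it can help to confirm that a local copy of the repo has the files listed above. The following is a minimal sketch, not part of the cxrembed package: `check_artifacts` is a hypothetical helper, and treating every file not marked "(optional)" as required is an assumption.

```python
from pathlib import Path

# File names as listed above. Treating the unmarked ones as required
# is an assumption, not something the package is known to enforce.
REQUIRED = ["unified_pooler.pt", "unified_proj.pt", "cxrembed_config.json"]
OPTIONAL = ["lora", "text_proj.pt", "image_proj.pt"]


def check_artifacts(repo_dir):
    """Return (missing_required, present_optional) for a local checkout."""
    root = Path(repo_dir)
    missing = [name for name in REQUIRED if not (root / name).exists()]
    present = [name for name in OPTIONAL if (root / name).exists()]
    return missing, present
```

An empty first list means all of the assumed-required files were found; the second list shows which optional pieces are available.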

Quickstart

import torch
from cxrembed import CxREmbedder

# Download from the Hub and load the backbone + adapters + heads
m = CxREmbedder.from_pretrained(
    "<ORG>/<REPO>",
    device="cuda" if torch.cuda.is_available() else "cpu",
    amp=True,
)

# Embed a structured record (multi-image + multi-text)
emb = m.embed_record(
    current_img="/path/to/current_frontal.png",
    lateral_img="/path/to/lateral.png",
    prior_img="/path/to/prior.png",
    additional_img=None,
    prior_report="...",
    current_report="...",
    demographics="Age 67, male",
    lab_test="WBC 12.3",
    history="SOB, fever",
    additional_txt="Question: pneumonia?",
    instruction="Embed this clinical record for retrieval.",
)

# Embed a candidate answer (text-only)
ans = m.embed_answer("Right lower lobe consolidation consistent with pneumonia.")

# Dot-product similarity in embedding space
score = (emb @ ans.T).item()
print(score)
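
For retrieval, the same similarity is usually computed against several candidate answers and the results sorted. The sketch below shows that ranking step on plain Python lists so it runs without the model. `l2_normalize` and `rank_candidates` are hypothetical helpers. The README does not say whether the embeddings are already unit-length, so the sketch normalizes explicitly and reports cosine similarity.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def rank_candidates(query, candidates):
    """Return (index, cosine similarity) pairs, best match first."""
    q = l2_normalize(query)
    scored = []
    for i, cand in enumerate(candidates):
        c = l2_normalize(cand)
        scored.append((i, sum(a * b for a, b in zip(q, c))))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings.
query = [1.0, 0.0, 0.0]
candidates = [[0.0, 1.0, 0.0], [2.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
ranking = rank_candidates(query, candidates)
print(ranking[0][0])  # index of the closest candidate
```

With real outputs, and assuming embed_record and embed_answer return tensors of shape (1, D) as the `emb @ ans.T` call above suggests, `emb[0].tolist()` would give a list that fits this helper.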

Placeholders supported in templates

Images:

<current_image> (alias of <frontal_image>)
<lateral_image>
<prior_image>
<additional_image>
<additional_image1>, <additional_image2>, ... if you pass a list to additional_img

Texts:

<current_report> (alias of <report>)
<prior_report>
<demographics>
<lab_test>
<history>
<additional_txt>
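
To check a custom template against the placeholders listed above, one option is to scan it before passing it to the model. This is a sketch only: `scan_template` and `additional_image_names` are hypothetical helpers, and how the cxrembed package actually parses templates is not shown here. Mapping each alias pair onto the first-listed name is also an assumption.

```python
import re

# Alias pairs from the lists above, mapped onto one name each.
# Which side is canonical inside the package is an assumption.
ALIASES = {"frontal_image": "current_image", "report": "current_report"}

IMAGE_KEYS = {"current_image", "lateral_image", "prior_image", "additional_image"}
TEXT_KEYS = {
    "current_report", "prior_report", "demographics",
    "lab_test", "history", "additional_txt",
}

# Matches <name> or <nameN>, e.g. <lab_test> or <additional_image2>.
PLACEHOLDER = re.compile(r"<([a-z_]+\d*)>")


def additional_image_names(n):
    """Numbered placeholder names for a list of n additional images."""
    return [f"additional_image{i}" for i in range(1, n + 1)]


def scan_template(template):
    """Split the placeholders in a template into (images, texts, unknown)."""
    images, texts, unknown = [], [], []
    for name in PLACEHOLDER.findall(template):
        name = ALIASES.get(name, name)
        if name in IMAGE_KEYS or re.fullmatch(r"additional_image\d+", name):
            images.append(name)
        elif name in TEXT_KEYS:
            texts.append(name)
        else:
            unknown.append(name)
    return images, texts, unknown


template = "Image: <frontal_image> Extra: <additional_image2> Report: <report> <foo>"
images, texts, unknown = scan_template(template)
```

A non-empty `unknown` list points to a typo or an unsupported placeholder in the template.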

Notes

This model is intended for research use and may require additional validation before any clinical use.
Do not upload protected health information (PHI) to public repositories.