CxREmbed (multi-image / multi-text unified embeddings)
This repository contains lightweight inference code and trained embedding heads for a multimodal chest X-ray (CXR) embedding model built on the Lingshu-7B / Qwen2.5-VL backbone.
The repository ships only the delta weights (LoRA adapter plus pooling/projection heads). The base model weights remain in the original upstream repository.
What is included
- lora/ (optional): PEFT LoRA adapter weights
- unified_pooler.pt: pooling head
- unified_proj.pt: projection head to the unified embedding space
- text_proj.pt / image_proj.pt (optional)
- cxrembed_config.json: minimal configuration
- cxrembed/: small Python package with an inference wrapper
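To check which of these components a local copy of the repository actually contains, a minimal sketch along these lines may help. The helper name `inventory_checkpoint` and the required/optional split are assumptions inferred from the "(optional)" markers in the list above; they are not part of the cxrembed package.

```python
from pathlib import Path

# Assumed split, inferred from the "(optional)" markers in the list above.
REQUIRED = ["unified_pooler.pt", "unified_proj.pt", "cxrembed_config.json", "cxrembed"]
OPTIONAL = ["lora", "text_proj.pt", "image_proj.pt"]


def inventory_checkpoint(root):
    """Return (missing_required, present_optional) for a local repo copy.

    Hypothetical helper: it only checks that the paths exist, not that
    the files are valid checkpoints.
    """
    root = Path(root)
    missing = [name for name in REQUIRED if not (root / name).exists()]
    present = [name for name in OPTIONAL if (root / name).exists()]
    return missing, present
```

An empty first list means every non-optional component was found; the second list shows which optional pieces (for example the LoRA adapter) are available.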
Quickstart
import torch
from cxrembed import CxREmbedder
# Download from the Hub and load the backbone + adapters + heads
m = CxREmbedder.from_pretrained(
    "<ORG>/<REPO>",
    device="cuda" if torch.cuda.is_available() else "cpu",
    amp=True,
)
# Embed a structured record (multi-image + multi-text)
emb = m.embed_record(
    current_img="/path/to/current_frontal.png",
    lateral_img="/path/to/lateral.png",
    prior_img="/path/to/prior.png",
    additional_img=None,
    prior_report="...",
    current_report="...",
    demographics="Age 67, male",
    lab_test="WBC 12.3",
    history="SOB, fever",
    additional_txt="Question: pneumonia?",
    instruction="Embed this clinical record for retrieval.",
)
# Embed a candidate answer (text-only)
ans = m.embed_answer("Right lower lobe consolidation consistent with pneumonia.")
# Similarity in embedding space
score = float((emb @ ans.T).item())
print(score)
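The quickstart scores a single candidate. For retrieval, one would typically rank several candidates against the same record. The sketch below shows that step on plain Python lists, which could be obtained from the tensors with something like `emb.squeeze(0).tolist()` (assuming a shape of `(1, D)`, as the `emb @ ans.T` expression suggests). Whether the model's embeddings are already L2-normalized is not stated here, so the sketch normalizes explicitly; `cosine` and `rank_candidates` are illustrative names, not part of the package.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention of this sketch for zero vectors
    return dot / (norm_u * norm_v)


def rank_candidates(query_vec, candidate_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine(query_vec, c)) for i, c in enumerate(candidate_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings.
query = [1.0, 0.0, 0.0]
candidates = [[0.0, 1.0, 0.0], [2.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
ranking = rank_candidates(query, candidates)
print(ranking[0][0])  # index of the best-matching candidate
```

If the embeddings turn out to be unit-normalized already, the plain dot product used in the quickstart gives the same ordering.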
Placeholders supported in templates
Images:
- <current_image> (alias of <frontal_image>)
- <lateral_image>
- <prior_image>
- <additional_image>
- <additional_image1>, <additional_image2>, ... if you pass a list to additional_img
Texts:
- <current_report> (alias <report>)
- <prior_report>
- <demographics>
- <lab_test>
- <history>
- <additional_txt>
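To illustrate how placeholders like these could be resolved, here is a rough sketch. It is not the package's actual templating code: the function names, the handling of missing fields (replaced by an empty string), and the decision to leave image tags untouched are all assumptions made for illustration.

```python
import re

# Both spellings resolve to the same field; the alias pair comes from the list above.
TEXT_ALIASES = {"report": "current_report"}
TEXT_FIELDS = {"current_report", "prior_report", "demographics",
               "lab_test", "history", "additional_txt"}


def fill_text_placeholders(template, record):
    """Replace <field> text placeholders with values from `record`.

    Missing or None fields become empty strings (an assumption of this
    sketch); tags that are not known text fields, such as image tags,
    are left untouched.
    """
    def substitute(match):
        name = TEXT_ALIASES.get(match.group(1), match.group(1))
        if name not in TEXT_FIELDS:
            return match.group(0)
        value = record.get(name)
        return "" if value is None else str(value)

    return re.sub(r"<(\w+)>", substitute, template)


def additional_image_tags(additional_img):
    """Placeholder names for `additional_img`: one path, a list of paths, or None."""
    if additional_img is None:
        return []
    if isinstance(additional_img, (list, tuple)):
        # Numbered tags start at 1, matching <additional_image1>, <additional_image2>, ...
        return [f"<additional_image{i}>" for i in range(1, len(additional_img) + 1)]
    return ["<additional_image>"]


# Illustrative template and record values.
template = "<current_image> History: <history> Report: <report>"
record = {"history": "SOB, fever", "current_report": "No acute findings."}
print(fill_text_placeholders(template, record))
```

The second helper mirrors the rule above: a single extra image maps to <additional_image>, while a list maps to numbered tags.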
Notes
- This model is intended for research and may require additional validation for clinical use.
- Do not upload protected health information (PHI) to public repositories.