# CxREmbed (multi-image / multi-text unified embeddings)

This repository contains **lightweight inference code + trained embedding heads** for a multi-modal CXR embedding model built on top of the base **Lingshu-7B / Qwen2.5-VL** backbone.

The repo is structured to upload only the *delta weights* (LoRA adapter + pooling/projection heads). The base model weights remain in the original upstream repository.

## What is included

- `lora/` (optional): PEFT LoRA adapter weights
- `unified_pooler.pt`: pooling head
- `unified_proj.pt`: projection head to the unified embedding space
- `text_proj.pt` / `image_proj.pt` (optional)
- `cxrembed_config.json`: minimal configuration
- `cxrembed/`: small Python package with an inference wrapper

## Quickstart

```python
import torch
from cxrembed import CxREmbedder

# Download from the Hub and load the backbone + adapters + heads
m = CxREmbedder.from_pretrained(
    "/",
    device="cuda" if torch.cuda.is_available() else "cpu",
    amp=True,
)

# Embed a structured record (multi-image + multi-text)
emb = m.embed_record(
    current_img="/path/to/current_frontal.png",
    lateral_img="/path/to/lateral.png",
    prior_img="/path/to/prior.png",
    additional_img=None,
    prior_report="...",
    current_report="...",
    demographics="Age 67, male",
    lab_test="WBC 12.3",
    history="SOB, fever",
    additional_txt="Question: pneumonia?",
    instruction="Embed this clinical record for retrieval.",
)

# Embed a candidate answer (text-only)
ans = m.embed_answer("Right lower lobe consolidation consistent with pneumonia.")

# Similarity in embedding space
score = float((emb @ ans.T).item())
print(score)
```

## Placeholders supported in templates

Images:

- `` (alias of ``)
- ``
- ``
- ``
- ``, ``, ... if you pass a list to `additional_img`

Texts:

- `` (alias ``)
- ``
- ``
- ``
- ``
- ``

## Notes

- This model is intended for **research** and may require additional validation for clinical use.
- Do not upload protected health information (PHI) to public repositories.
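
## Example: ranking several candidate answers

The Quickstart scores one record against one answer. For retrieval you would usually score a record against several candidates and sort them. The sketch below shows that step with cosine similarity on plain Python lists, so it runs without the backbone. The helpers `cosine` and `rank_candidates` are hypothetical and not part of the `cxrembed` package, and the toy vectors stand in for real embeddings. Whether the model's outputs are already L2-normalized is not stated here, so the sketch normalizes explicitly rather than relying on a raw dot product.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_candidates(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (candidate_index, score) pairs, best match first."""
    scored = [(i, cosine(query, c)) for i, c in enumerate(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for a record embedding and three answer embeddings
# (hypothetical values, not model outputs).
record_vec = [0.9, 0.1, 0.0]
answer_vecs = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
]

ranking = rank_candidates(record_vec, answer_vecs)
for index, score in ranking:
    print(index, round(score, 4))
```

With real outputs, each embedding would first need converting to a flat list of floats. How to do that depends on the shape returned by `embed_record` and `embed_answer`; the `.item()` call in the Quickstart suggests a single row per input, but that is an assumption.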