# CXR-VLM: Unified Vision-Language Model for Chest X-ray Interpretation

A lightweight Vision-Language Model for **three tasks** on chest X-rays:

1. **Findings Generation**: detailed radiological findings
2. **Impression Generation**: concise clinical summary
3. **Visual Question Answering (VQA)**: answer specific clinical questions

## Architecture (based on RaDialog)

```
CXR Image
    │
BioViL-T Encoder (frozen)          ← domain-specific CXR encoder
    │  [768-dim patch features]
MLP Projection Layer (trained)     ← align to LLM space
    │  [32 image tokens]
    +  CheXpert Findings (structured labels, optional)
    +  Task Instruction Prompt
    │
Vicuna-7B + LoRA (LLM trained with LoRA)
    │
Output Text (findings / impression / answer)
```

## Project Structure

```
cxr_vlm/
├── configs/
│   ├── model_config.yaml        # model hyperparameters
│   └── train_config.yaml        # training hyperparameters
├── model/
│   ├── __init__.py
│   ├── image_encoder.py         # BioViL-T wrapper
│   ├── projection.py            # MLP alignment layer
│   ├── chexpert_classifier.py   # CheXpert structured findings classifier
│   └── cxr_vlm.py               # full model (encoder + projection + LLM)
├── data/
│   ├── __init__.py
│   ├── dataset.py               # CXRInstructDataset (load later)
│   ├── prompt_templates.py      # instruction templates for 3 tasks
│   └── collator.py              # DataCollator for variable-length inputs
├── training/
│   ├── __init__.py
│   ├── trainer.py               # custom HuggingFace Trainer
│   └── train.py                 # main training entry point
├── evaluation/
│   ├── __init__.py
│   ├── metrics.py               # BLEU, ROUGE, ClinicalF1, BERTScore
│   └── evaluate.py              # evaluation entry point
├── utils/
│   ├── __init__.py
│   ├── logger.py                # logging setup
│   └── checkpoint.py            # save/load utilities
├── scripts/
│   ├── train.sh                 # training shell script
│   └── evaluate.sh              # evaluation shell script
└── README.md
```

## Setup

```bash
conda create -n cxr_vlm python=3.10
conda activate cxr_vlm
conda install pytorch==2.0.1 torchvision==0.15.2 pytorch-cuda=11.7 -c pytorch -c nvidia
pip install -r requirements.txt
```

## Training

```bash
bash scripts/train.sh
```

## Evaluation

```bash
bash scripts/evaluate.sh
```
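
The project structure lists ROUGE among the metrics in `evaluation/metrics.py`. As a rough illustration of what such a score measures on generated reports, here is a minimal pure-Python sketch of ROUGE-L F1. The function names (`lcs_length`, `rouge_l_f1`), the lowercase whitespace tokenization, and the choice of the ROUGE-L variant are assumptions made for this example; the repository's actual implementation may differ, for instance by wrapping an existing metrics library.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                # Extend the best subsequence that ends before both tokens.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of "skip tok_a" / "skip tok_b".
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction, reference):
    """ROUGE-L F1 between a generated report and a reference report.

    Hypothetical helper: tokenization is lowercase whitespace splitting,
    which is an assumption and not necessarily what metrics.py does.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    lcs = lcs_length(pred_tokens, ref_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred_tokens)
    recall = lcs / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    generated = "No acute cardiopulmonary process"
    reference = "No evidence of acute cardiopulmonary process"
    print(f"ROUGE-L F1: {rouge_l_f1(generated, reference):.3f}")
```

Lexical overlap scores like this one reward matching wording rather than clinical correctness, which is presumably why the metrics list also includes ClinicalF1 and BERTScore.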