hieu3636
/

cxr-vlm-code

Model card Files Files and versions

cxr-vlm-code / README.md

convitom

initial commit

28b13fc 19 days ago

|

history blame contribute delete

2.58 kB

	# CXR-VLM: Unified Vision-Language Model for Chest X-ray Interpretation

	A lightweight Vision-Language Model for three tasks on chest X-rays:
	1. Findings Generation — detailed radiological findings
	2. Impression Generation — concise clinical summary
	3. Visual Question Answering (VQA) — answer specific clinical questions

	## Architecture (based on RaDialog)

	```
	CXR Image
	│
	BioViL-T Encoder (frozen) ← domain-specific CXR encoder
	│ [768-dim patch features]
	MLP Projection Layer (trained) ← align to LLM space
	│ [32 image tokens]
	+ CheXpert Findings (structured labels, optional)
	+ Task Instruction Prompt
	│
	Vicuna-7B + LoRA (LLM trained with LoRA)
	│
	Output Text (findings / impression / answer)
	```

	## Project Structure

	```
	cxr_vlm/
	├── configs/
	│ ├── model_config.yaml # model hyperparameters
	│ └── train_config.yaml # training hyperparameters
	├── model/
	│ ├── __init__.py
	│ ├── image_encoder.py # BioViL-T wrapper
	│ ├── projection.py # MLP alignment layer
	│ ├── chexpert_classifier.py # CheXpert structured findings classifier
	│ └── cxr_vlm.py # full model (encoder + projection + LLM)
	├── data/
	│ ├── __init__.py
	│ ├── dataset.py # CXRInstructDataset (load later)
	│ ├── prompt_templates.py # instruction templates for 3 tasks
	│ └── collator.py # DataCollator for variable-length inputs
	├── training/
	│ ├── __init__.py
	│ ├── trainer.py # custom HuggingFace Trainer
	│ └── train.py # main training entry point
	├── evaluation/
	│ ├── __init__.py
	│ ├── metrics.py # BLEU, ROUGE, ClinicalF1, BERTScore
	│ └── evaluate.py # evaluation entry point
	├── utils/
	│ ├── __init__.py
	│ ├── logger.py # logging setup
	│ └── checkpoint.py # save/load utilities
	├── scripts/
	│ ├── train.sh # training shell script
	│ └── evaluate.sh # evaluation shell script
	└── README.md
	```

	## Setup

	```bash
	conda create -n cxr_vlm python=3.10
	conda activate cxr_vlm
	conda install pytorch==2.0.1 torchvision==0.15.2 pytorch-cuda=11.7 -c pytorch -c nvidia
	pip install -r requirements.txt
	```

	## Training

	```bash
	bash scripts/train.sh
	```

	## Evaluation

	```bash
	bash scripts/evaluate.sh
	```