---
license: mit
metrics:
- bertscore
- bleu
- rouge
base_model:
- microsoft/phi-2
---

# Dual-View SLaVA-CXR

**Dual-View SLaVA-CXR** is a vision-language model that generates structured radiology reports from frontal and lateral chest X-rays. It extends the original SLaVA-CXR model and its Re³ (Recognize-Reason-Report) paradigm with dual-view vision fusion, combining the CLIP and BiomedCLIP image encoders with the Phi-2 language model for enhanced anatomical reasoning.

---

## Directory Structure

```bash
├── Data Collection and Preprocessing/
│   ├── Data_collection_Mimic.ipynb
│   ├── Data_preprocess.ipynb
│   ├── Radgraph Based Report Cleaning.ipynb
│   └── train_data_json_gen.ipynb
│
├── Evaluate/
│   ├── Evaluate.ipynb
│   └── Results_IU_Xray/             # Contains evaluation results on the IU X-Ray dataset
│
├── llava_phi/
│   ├── Dual Slava train.ipynb       # Training pipeline
│   └── generation.ipynb             # Inference/report generation
│
├── requirements.txt
└── README.md
```

---

## Key Contributions

- **Dual-Encoder Fusion**: Combines CLIP and BiomedCLIP features for each view using a learnable weight α.
- **Cross-View Attention**: Enables anatomical reasoning across the frontal and lateral views.
- **Gated Feature Fusion**
- **Re³ Pipeline**:
  1. **Recognize**: Generate Findings from images
  2. **Reason**: Infer Impression from Findings
  3. **Report**: Output structured radiology reports
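
The three fusion steps listed above can be illustrated with a small, dependency-free sketch. It is not the project's implementation (that lives in `llava_phi/Dual Slava train.ipynb`). It assumes the common textbook forms: a sigmoid-bounded blend `α·clip + (1−α)·biomed`, scaled dot-product attention from one view to the other, and a per-dimension sigmoid gate. All function names, the tiny feature vectors, and the directly supplied `alpha_logit` and `gate_logits` are hypothetical; in a trained model these values would be learned parameters or outputs of learned layers.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dual_encoder_fusion(clip_feat, biomed_feat, alpha_logit):
    """Blend two encoders' features for one view: a*clip + (1-a)*biomed."""
    a = sigmoid(alpha_logit)  # keeps the learnable weight inside (0, 1)
    return [a * c + (1.0 - a) * b for c, b in zip(clip_feat, biomed_feat)]


def cross_view_attention(queries, keys, values):
    """Scaled dot-product attention: tokens of one view attend to the other."""
    scale = math.sqrt(len(keys[0]))
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        attended.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return attended


def gated_fusion(own, other, gate_logits):
    """Per-dimension gate g in (0, 1): g*own + (1-g)*other."""
    fused = []
    for o, x, z in zip(own, other, gate_logits):
        g = sigmoid(z)
        fused.append(g * o + (1.0 - g) * x)
    return fused


# Toy example: one token per view, two feature dimensions (made-up numbers).
frontal = [dual_encoder_fusion([1.0, 0.0], [0.0, 1.0], alpha_logit=0.0)]
lateral = [dual_encoder_fusion([0.2, 0.4], [0.6, 0.8], alpha_logit=0.0)]
context = cross_view_attention(frontal, lateral, lateral)
fused = gated_fusion(frontal[0], context[0], gate_logits=[0.0, 0.0])
```

Passing the gate logits in directly keeps the sketch short; a real gate would typically compute them from the concatenated features of both views.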

---

## Evaluation Metrics

| Dataset   | BLEU | ROUGE-L | METEOR | BERTScore | RadGraph F1 | CheXbert F1 |
| --------- | ---- | ------- | ------ | --------- | ----------- | ----------- |
| MIMIC-CXR | ✅   | ✅      | ✅     | ✅        | ✅          | ✅          |
| IU X-Ray  | ✅   | ✅      | ✅     | ✅        | ✅          | ✅          |

_(Results for the IU X-Ray dataset are in `Evaluate/Results_IU_Xray/`.)_
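
For readers unfamiliar with ROUGE-L, the sketch below shows what the metric measures: the longest common subsequence (LCS) of tokens shared by a reference report and a generated one, turned into precision, recall, and their harmonic mean. It is a simplified illustration with whitespace tokenization and a plain F1, not the evaluation code in `Evaluate/Evaluate.ipynb`, which may rely on a library implementation with different tokenization. The function names and example sentences are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L as the F1 of LCS-based precision and recall."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Made-up sentences, only to show the call.
score = rouge_l_f1("no acute cardiopulmonary process", "no acute process")
```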

---

## Setup

```bash
# Clone repo
git clone https://github.com/Clintonkjkj/Dual-View-Slava-CXR.git
cd Dual-View-Slava-CXR

# Set up virtual environment
python -m venv venv
source venv/bin/activate  # on Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

---

## Usage

### Download the model

The model is hosted on Hugging Face: https://huggingface.co/CKJ26/Dual-View-Slava-Final
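
One way to fetch the repository from a script is the `snapshot_download` function of the `huggingface_hub` package. The sketch below assumes that package is installed (`pip install huggingface_hub`) and that the repository is publicly accessible. The `download_model` helper and the `checkpoints/dual-view-slava` target directory are hypothetical names chosen for illustration.

```python
REPO_ID = "CKJ26/Dual-View-Slava-Final"


def download_model(local_dir="checkpoints/dual-view-slava"):
    """Download every file of the model repository and return the local path."""
    # Imported here so the module can be loaded without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_model()` needs network access; the returned path can then be given to whatever loading code `llava_phi/generation.ipynb` uses.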

### Train the Model

Train with `llava_phi/Dual Slava train.ipynb` after preparing the data with the notebooks in `Data Collection and Preprocessing/`:

- `Data_collection_Mimic.ipynb`
- `Data_preprocess.ipynb`
- `Radgraph Based Report Cleaning.ipynb`
- `train_data_json_gen.ipynb`

### Generate Reports

Run `llava_phi/generation.ipynb` with both the frontal and lateral views, plus a prompt (e.g., "Generate a radiology report").
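
The Recognize, Reason, Report flow behind report generation can be pictured as two model calls followed by an assembly step. The sketch below is only an outline under stated assumptions: `recognize` and `reason` are hypothetical callables standing in for whatever model-calling code the notebook uses, the stage prompt and the section labels are illustrative, and the stand-in functions return canned text rather than model output.

```python
def generate_report(frontal_image, lateral_image, recognize, reason,
                    prompt="Generate a radiology report"):
    """Run the three stages and return a structured report string."""
    # Recognize: describe what is visible in both views.
    findings = recognize(frontal_image, lateral_image, prompt).strip()
    # Reason: derive the impression from the findings text alone.
    impression = reason(findings).strip()
    # Report: assemble the structured output.
    return f"FINDINGS:\n{findings}\n\nIMPRESSION:\n{impression}"


# Stand-ins for the model calls (hypothetical, canned text).
def fake_recognize(frontal_image, lateral_image, prompt):
    return " Lungs are clear. "


def fake_reason(findings):
    return "No acute cardiopulmonary process."


report = generate_report("frontal.png", "lateral.png", fake_recognize, fake_reason)
```

Keeping the stages as separate callables makes it easy to inspect the Findings text before the Impression is inferred from it.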

---

## Model Architecture

![Model Architecture]()

---

## Citation

```bibtex
@misc{dualviewslava2025,
  title={Dual View SLaVA-CXR: Structured Radiology Reporting via Multi-View Chest X-rays},
  author={Clinton KJ and others},
  year={2025},
  note={Capstone Project}
}
```

---

## Author

- **Clinton KJ** - [Hugging Face Profile](https://huggingface.co/CKJ26)

---

## License

This repository is provided for academic research purposes only.