Update README.md
README.md
CHANGED
@@ -2,4 +2,176 @@
license: mit
language:
- en
---
# MM-DLS
# Multi-task deep learning based on PET/CT images for the diagnosis and prognosis prediction of advanced non-small cell lung cancer

## Overview

**MM-DLS** is a multi-modal, multi-task deep learning framework for the diagnosis, staging, and prognosis prediction of advanced non-small cell lung cancer (NSCLC). It integrates multi-source data, including CT images, PET metabolic parameters, and clinical information, to provide a unified, non-invasive decision-making tool for personalized treatment planning.

This repository implements the full MM-DLS pipeline, consisting of:
- Lung-lesion segmentation with a cross-attention transformer
- Multi-modal feature fusion (CT, PET, clinical)
- Multi-task learning: pathological classification, TNM staging, and DFS and OS survival prediction
- Cox proportional hazards survival loss

The framework supports both classification (adenocarcinoma vs. squamous cell carcinoma) and survival risk prediction tasks, and has been validated on large-scale multi-center clinical datasets.

---

## Key Features

- **Multi-modal fusion:** Combines CT-based imaging features, PET metabolic biomarkers (SUVmax, SUVmean, SUVpeak, TLG, MTV), and structured clinical variables (age, sex, smoking status, smoking duration, smoking cessation history, tumor size).
- **Multi-task learning:** Simultaneous optimization for:
  - Histological subtype classification (LUAD vs. LUSC)
  - TNM stage classification (I-II, III, IV)
  - Disease-free survival (DFS) prediction
  - Overall survival (OS) prediction
- **Attention-based feature fusion:** A transformer cross-attention module integrates lung-lesion spatial information.
- **Survival modeling:** Incorporates a Cox proportional hazards loss for survival risk prediction.
- **Flexible data simulation and loading:** Includes utilities for synthetic data generation and multi-slice 2D volume processing.
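
A Cox proportional hazards loss is commonly implemented as the negative partial log-likelihood over observed events. The block below is a minimal pure-Python sketch of that general technique, not the repository's `CoxPHLoss`; the function name `cox_ph_loss`, its argument layout, and the averaging over events are illustrative assumptions, and tied event times get no special handling.

```python
import math


def cox_ph_loss(risk_scores, times, events):
    """Negative Cox partial log-likelihood, averaged over observed events.

    risk_scores: predicted log-risk per patient (higher means higher hazard)
    times:       follow-up time per patient
    events:      1 if the event was observed, 0 if the patient was censored
    (Hypothetical signature, chosen for illustration only.)
    """
    n_events = sum(events)
    if n_events == 0:
        return 0.0  # no observed events, so there is nothing to fit
    total = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored patients only appear inside risk sets
        # Risk set: every patient still under observation at time t_i.
        risk_set = [r for r, t in zip(risk_scores, times) if t >= t_i]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(risk_set)
        log_denominator = m + math.log(sum(math.exp(r - m) for r in risk_set))
        total += risk_scores[i] - log_denominator
    return -total / n_events
```

The loss decreases when patients with earlier events receive higher risk scores than the patients who outlive them, which is why the same form can be applied separately to the DFS and OS outputs.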

---

## Architecture

The overall MM-DLS system consists of:







1. **Segmentation Module (`LungLesionSegmentor`):**
   - Shared ResNet encoder to extract features from CT images.
   - Dual decoders for lung and lesion segmentation.
   - Transformer-based cross-attention module for enhanced spatial feature interaction between lung and lesion regions.

2. **Feature Encoders:**
   - `LesionEncoder`: 2D convolutional encoder for lesion patches.
   - `SpaceEncoder`: 2D convolutional encoder for lung-space contextual patches.

3. **Attention Fusion Module:**
   - `LesionAttentionFusion`: Multi-head attention to fuse lesion and lung features into compact patient-level representations.

4. **Patient-Level Fusion Model (`PatientLevelFusionModel`):**
   - Fully connected network that combines imaging, PET, and clinical features.
   - Outputs classification logits, DFS and OS risk scores.

5. **Loss Functions:**
   - Binary cross-entropy loss for classification.
   - Cox proportional hazards loss (`CoxPHLoss`) for survival prediction.
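
To make the attention fusion step more concrete, here is a minimal sketch of scaled dot-product attention, the operation that multi-head attention repeats per head. It is an illustration under stated assumptions rather than the code in `LesionAttentionFusion`: it uses a single head, no learned projections, plain Python lists, and a hypothetical function name `attention_fuse`, with a lesion feature vector as the query and lung-space feature vectors as keys and values.

```python
import math


def attention_fuse(query, keys, values):
    """Single-head scaled dot-product attention for one query vector.

    query:  lesion feature vector (length d)
    keys:   list of lung-space feature vectors (each of length d)
    values: list of vectors to be mixed, one per key
    Returns the fused vector and the attention weights.
    (Hypothetical helper; no learned projections are applied.)
    """
    d = len(query)
    # Similarity between the query and every key, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    # Softmax over the scores, with the maximum subtracted for stability.
    m = max(scores)
    exp_scores = [math.exp(s - m) for s in scores]
    z = sum(exp_scores)
    weights = [e / z for e in exp_scores]
    # Weighted sum of the value vectors.
    fused = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
    return fused, weights
```

In this sketch, keys that align with the lesion query receive larger weights, so the fused vector emphasizes the lung-space context most related to the lesion.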

---

## Code Structure

- `ModelLesionEncoder.py`: Lesion image encoder extracting discriminative features from multi-slice tumor regions.
- `ModelSpaceEncoder.py`: Lung space encoder modeling anatomical and spatial context beyond the lesion.
- `LesionAttentionFusion.py`: Attention-based fusion module for adaptive integration of lesion and spatial features.
- `ClinicalFusionModel.py`: Patient-level fusion network combining imaging features, radiomics, PET signals, and clinical variables.
- `HierMM_DLS.py`: Core hierarchical multimodal deep learning model supporting multi-task learning: (1) subtype classification; (2) TNM stage prediction; (3) DFS and OS modeling.
- `CoxphLoss.py`: Cox proportional hazards loss for survival modeling with censored data.
- `PatientDataset.py`: Patient dataset loader supporting imaging, radiomics, PET, clinical variables, survival outcomes, and treatment labels.
- `LungLesionSegmentation.py`: Lung-lesion segmentation model.
- `ImageDataLoader.py`: Image preprocessing and loading utilities for multi-slice inputs.
- `plot_results.py`: Visualization utilities for Kaplan–Meier curves, hazard ratios, and survival analysis results.

---

## Data Format

The input data is organized per patient as follows:

### Imaging Data:
- CT slices (PNG format)
- Lung masks (binary masks, PNG)
- Lesion masks (binary masks, PNG)
- Slices grouped per patient ID

### Tabular Data:
- Radiomics features: 128-dimensional vector (extracted with PyRadiomics)
- PET features: [SUVmax, SUVmean, SUVpeak, TLG, MTV]
- Clinical features: [Age, Sex, Smoking Status, Smoking Duration, Smoking Cessation, Tumor Diameter]
- Survival data: DFS time/event, OS time/event
- Classification label: LUAD (0) or LUSC (1)

Simulated data utilities are provided for experimentation and reproducibility.
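
As a rough illustration of the tabular layout above, the following sketch builds synthetic patient records with the listed dimensions. It is not the repository's simulation utility: the `PatientRecord` class, the `simulate_patient` function, the field names, the value ranges, and the event encoding (1 = observed, 0 = censored) are all assumptions made for this example.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class PatientRecord:
    """Hypothetical container mirroring the tabular fields listed above."""
    patient_id: str
    radiomics: List[float]  # 128-dimensional radiomics vector
    pet: List[float]        # [SUVmax, SUVmean, SUVpeak, TLG, MTV]
    clinical: List[float]   # [age, sex, smoking status, duration, cessation, diameter]
    dfs_time: float
    dfs_event: int          # assumed encoding: 1 = event observed, 0 = censored
    os_time: float
    os_event: int
    label: int              # 0 = LUAD, 1 = LUSC


def simulate_patient(patient_id, rng):
    """Draw one synthetic record; all value ranges are arbitrary placeholders."""
    dfs_time = rng.uniform(1.0, 60.0)
    return PatientRecord(
        patient_id=patient_id,
        radiomics=[rng.gauss(0.0, 1.0) for _ in range(128)],
        pet=[rng.uniform(0.0, 20.0) for _ in range(5)],
        clinical=[rng.uniform(0.0, 1.0) for _ in range(6)],
        dfs_time=dfs_time,
        dfs_event=rng.randint(0, 1),
        # Assumption: simulated OS is never shorter than simulated DFS.
        os_time=dfs_time + rng.uniform(0.0, 24.0),
        os_event=rng.randint(0, 1),
        label=rng.randint(0, 1),
    )


rng = random.Random(0)  # fixed seed so the synthetic cohort is reproducible
cohort = [simulate_patient(f"P{i:03d}", rng) for i in range(4)]
```

Keeping the vector lengths fixed (128, 5, and 6) lets a downstream loader check record shapes before any model code runs.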

---

## Installation

```bash
# Create and activate a conda environment
conda create -n mm_dls python=3.10 -y
conda activate mm_dls

# Clone the repository and enter it
git clone https://github.com/your_username/MM-DLS-NSCLC.git
cd MM-DLS-NSCLC
```
## Install dependencies
```bash
pip install -r requirements.txt
```

## Usage

### 🔽 Download Pretrained Models

Pretrained MM-DLS models are available for direct download:

- **MM-DLS (full multimodal, best checkpoint):** [⬇️ Download Pretrained Model](https://drive.google.com/file/d/1IcyCwMgCX8wv0NMp84U4wlzhLoXH7ayx/view?usp=drive_link) (size: 1.3 MB)

The MM-DLS model is intentionally lightweight (~1.3 MB) because it uses compact CNN encoders and MLP-based multimodal fusion rather than large pretrained backbones, enabling efficient deployment and fast inference.

After downloading, place the model files under the `./MODEL/` directory.

Training:
```bash
python train_patient_model.py
```
Evaluation:
```bash
python test.py
```
Example Forward Pass (open the notebook with Jupyter; a `.ipynb` file cannot be run with `python`):
```bash
jupyter notebook run_sample.ipynb
```

## Model Performance (from publication)
### Histological Subtype Classification:

AUC: 0.85 ~ 0.92 across cohorts

AP: 0.81 ~ 0.86

### TNM Stage Prediction:

AUC: Stage I-II (0.86 ~ 0.96), Stage III (0.85 ~ 0.95), Stage IV (0.83 ~ 0.95)

AP and calibration maintained across internal and external sets

### DFS & OS Prognosis:

C-index: up to 0.75

Time-dependent AUC (1/2/3 years): 0.77 ~ 0.91

Brier score: consistently < 0.2 for DFS and < 0.3 for OS

Superior to single-modality models (clinical-only or imaging-only)
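
For readers unfamiliar with the C-index reported above, the sketch below shows one common way to compute Harrell's concordance index from risk scores and censored survival data. It is an illustrative pure-Python version, not the evaluation code behind the published numbers; the function name `concordance_index` and its argument layout are assumptions, and pairs with tied event times are simply skipped.

```python
def concordance_index(risk_scores, times, events):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when patient i had an observed event and
    a strictly shorter follow-up time than patient j. The pair is
    concordant when patient i also has the higher predicted risk; tied
    risk scores count as half. (Hypothetical helper for illustration.)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for a reported C-index of up to 0.75.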

## Reference
Please cite our original publication when using this work.

## License
This project is licensed under the MIT License.

⚠️ **Notice:** The pretrained model is shared solely for research validation purposes and **should not be used, distributed, or cited before the associated study is formally published**.

## Contact
For any questions or collaborations, please contact:

Dr. Fang Dai: daifang_cool@163.com