FangDai committed on
Commit
bea5a4b
·
verified ·
1 Parent(s): c4a8353

Update README.md

  1. README.md +173 -1
README.md CHANGED
@@ -2,4 +2,176 @@
2
  license: mit
3
  language:
4
  - en
5
- ---
2
  license: mit
3
  language:
4
  - en
5
+ ---
6
+ # MM-DLS
7
+ # Multi-task deep learning based on PET/CT images for the diagnosis and prognosis prediction of advanced non-small cell lung cancer
8
+
9
+ ## Overview
10
+
11
+ **MM-DLS** is a multi-modal, multi-task deep learning framework for the diagnosis, staging, and prognosis prediction of advanced non-small cell lung cancer (NSCLC). It integrates multi-source data including CT images, PET metabolic parameters, and clinical information to provide a unified, non-invasive decision-making tool for personalized treatment planning.
12
+
13
+ This repository implements the full MM-DLS pipeline, consisting of:
14
+ - Lung-lesion segmentation with a cross-attention transformer
15
+ - Multi-modal feature fusion (CT, PET, Clinical)
16
+ - Multi-task learning: Pathological classification, TNM staging, DFS and OS survival prediction
17
+ - Cox proportional hazards survival loss
18
+
19
+ The framework supports both classification (adenocarcinoma vs squamous cell carcinoma) and survival risk prediction tasks, and has been validated on large-scale multi-center clinical datasets.
20
+
21
+ ---
22
+
23
+ ## Key Features
24
+
25
+ - **Multi-modal fusion:** Combines CT-based imaging features, PET metabolic biomarkers (SUVmax, SUVmean, SUVpeak, TLG, MTV), and structured clinical variables (age, sex, smoking status, smoking duration, smoking cessation history, tumor size).
26
+ - **Multi-task learning:** Simultaneous optimization for:
27
+   - Histological subtype classification (LUAD vs LUSC)
28
+   - TNM stage classification (I-II, III, IV)
29
+   - Disease-free survival (DFS) prediction
30
+   - Overall survival (OS) prediction
31
+ - **Attention-based feature fusion:** Transformer cross-attention module to integrate lung-lesion spatial information.
32
+ - **Survival modeling:** Incorporates Cox Proportional Hazards loss for survival time prediction.
33
+ - **Flexible data simulation and loading:** Includes utilities for synthetic data generation and multi-slice 2D volume processing.
34
+
35
+ ---
36
+
37
+ ## Architecture
38
+
39
+ ![Python](https://img.shields.io/badge/python-3.9%2B-blue)
40
+ ![PyTorch](https://img.shields.io/badge/PyTorch-2.x-red)
41
+ ![CUDA](https://img.shields.io/badge/CUDA-11.8%2B-green)
42
+ ![License](https://img.shields.io/badge/License-MIT-lightgrey)
43
+ ![Status](https://img.shields.io/badge/Status-Research-orange)
44
+
45
+ The overall MM-DLS system consists of:
46
+
47
+ 1. **Segmentation Module (LungLesionSegmentor):**
48
+ - Shared ResNet encoder to extract features from CT images.
49
+ - Dual decoders for lung and lesion segmentation.
50
+ - Transformer-based cross-attention module for enhanced spatial feature interaction between lung and lesion regions.
51
+
52
+ 2. **Feature Encoders:**
53
+ - `LesionEncoder`: 2D convolutional encoder for lesion patches.
54
+ - `SpaceEncoder`: 2D convolutional encoder for lung-space contextual patches.
55
+
56
+ 3. **Attention Fusion Module:**
57
+ - `LesionAttentionFusion`: Multi-head attention to fuse lesion and lung features into compact patient-level representations.
58
+
59
+ 4. **Patient-Level Fusion Model (PatientLevelFusionModel):**
60
+ - Fully connected network that combines imaging, PET, and clinical features.
61
+ - Outputs classification logits, DFS and OS risk scores.
62
+
63
+ 5. **Loss Functions:**
64
+ - Binary cross-entropy loss for classification.
65
+ - Cox proportional hazards loss (`CoxPHLoss`) for survival prediction.
66
+
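As a rough illustration of what a Cox proportional hazards loss computes, the sketch below evaluates the negative log partial likelihood in plain Python, using Breslow-style risk sets for tied times. The function name `cox_ph_loss` and its arguments are assumptions made for this example; the repository's `CoxPHLoss` in `CoxphLoss.py` operates on tensors and may differ in tie handling and normalization.

```python
import math


def cox_ph_loss(risks, times, events):
    """Negative log Cox partial likelihood, averaged over observed events.

    risks  : predicted log-risk scores (higher means higher hazard)
    times  : follow-up times
    events : 1 if the event was observed, 0 if censored
    """
    n_events = sum(events)
    if n_events == 0:
        return 0.0  # no observed events, so there is nothing to score
    loss = 0.0
    for i in range(len(risks)):
        if not events[i]:
            continue  # censored patients only contribute through risk sets
        # Risk set: everyone still under observation at time times[i]
        at_risk = [risks[j] for j in range(len(risks)) if times[j] >= times[i]]
        # Numerically stable log-sum-exp over the risk set
        m = max(at_risk)
        log_denominator = m + math.log(sum(math.exp(r - m) for r in at_risk))
        loss -= risks[i] - log_denominator
    return loss / n_events


# Toy check (hypothetical values): risk scores that rank early events
# highest should give a lower loss than the reversed ranking.
well_ranked = cox_ph_loss([2.0, 1.0, 0.0], [1.0, 2.0, 3.0], [1, 1, 0])
badly_ranked = cox_ph_loss([0.0, 1.0, 2.0], [1.0, 2.0, 3.0], [1, 1, 0])
```

Censored patients never add a term of their own, but they stay in the denominator of every event that occurs before their censoring time, which is how the loss uses censored follow-up.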
67
+ ---
68
+
69
+ ## Code Structure
70
+
71
+ - `ModelLesionEncoder.py`: Lesion image encoder extracting discriminative features from multi-slice tumor regions.
72
+ - `ModelSpaceEncoder.py`: Lung space encoder modeling anatomical and spatial context beyond the lesion.
73
+ - `LesionAttentionFusion.py`: Attention-based fusion module for adaptive integration of lesion and spatial features.
74
+ - `ClinicalFusionModel.py`: Patient-level fusion network combining imaging features, radiomics, PET signals, and clinical variables.
75
+ - `HierMM_DLS.py`: Core hierarchical multimodal deep learning model supporting multi-task learning: (1) subtype classification; (2) TNM stage prediction; (3) DFS and OS modeling.
76
+ - `CoxphLoss.py`: Cox proportional hazards loss for survival modeling with censored data.
77
+ - `PatientDataset.py`: Patient dataset loader supporting imaging, radiomics, PET, clinical variables, survival outcomes, and treatment labels.
78
+ - `LungLesionSegmentation.py`: Lung-lesion segmentation model.
79
+ - `ImageDataLoader.py`: Image preprocessing and loading utilities for multi-slice inputs.
80
+ - `plot_results.py`: Visualization utilities for Kaplan–Meier curves, hazard ratios, and survival analysis results.
81
+
82
+ ---
83
+
84
+ ## Data Format
85
+
86
+ The input data is organized per patient as follows:
87
+
88
+ ### Imaging Data:
89
+ - CT slices (PNG format)
90
+ - Lung masks (binary masks, PNG)
91
+ - Lesion masks (binary masks, PNG)
92
+ - Slices grouped per patient ID
93
+
94
+ ### Tabular Data:
95
+ - Radiomics features: 128-dimensional vector (PyRadiomics extracted)
96
+ - PET features: [SUVmax, SUVmean, SUVpeak, TLG, MTV]
97
+ - Clinical features: [Age, Sex, Smoking Status, Smoking Duration, Smoking Cessation, Tumor Diameter]
98
+ - Survival data: DFS time/event, OS time/event
99
+ - Classification label: LUAD (0) or LUSC (1)
100
+
101
+ Simulated data utilities are provided for experimentation and reproducibility.
102
+
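To make the per-patient layout concrete, here is a small sketch of a synthetic record generator. Everything in it is hypothetical: the `PatientRecord` class, its field names, the value ranges, and the `make_synthetic_patient` helper are assumptions for illustration and are not the API of `PatientDataset.py`. Only the feature counts (128 radiomics, 5 PET, 6 clinical) and the label coding (LUAD = 0, LUSC = 1) are taken from the description above.

```python
import random
from dataclasses import dataclass
from typing import List

PET_FEATURES = ["SUVmax", "SUVmean", "SUVpeak", "TLG", "MTV"]
CLINICAL_FEATURES = ["Age", "Sex", "Smoking Status", "Smoking Duration",
                     "Smoking Cessation", "Tumor Diameter"]


@dataclass
class PatientRecord:
    patient_id: str
    radiomics: List[float]  # 128-dimensional vector
    pet: List[float]        # ordered as PET_FEATURES
    clinical: List[float]   # ordered as CLINICAL_FEATURES
    dfs_time: float         # arbitrary time units
    dfs_event: int          # 1 = event observed, 0 = censored
    os_time: float
    os_event: int
    label: int              # 0 = LUAD, 1 = LUSC


def make_synthetic_patient(patient_id: str, rng: random.Random) -> PatientRecord:
    """Draw one random record; values are placeholders, not clinically realistic."""
    dfs_time = rng.uniform(1.0, 60.0)
    return PatientRecord(
        patient_id=patient_id,
        radiomics=[rng.gauss(0.0, 1.0) for _ in range(128)],
        pet=[rng.uniform(0.0, 20.0) for _ in PET_FEATURES],
        clinical=[rng.uniform(0.0, 1.0) for _ in CLINICAL_FEATURES],
        dfs_time=dfs_time,
        dfs_event=rng.randint(0, 1),
        os_time=dfs_time + rng.uniform(0.0, 24.0),  # keep OS >= DFS
        os_event=rng.randint(0, 1),
        label=rng.randint(0, 1),
    )


# A seeded generator keeps the toy cohort reproducible.
rng = random.Random(0)
cohort = [make_synthetic_patient(f"P{i:03d}", rng) for i in range(4)]
```

Keeping the feature order in named constants makes it harder to mix up columns when the vectors are later concatenated for patient-level fusion.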
103
+ ---
104
+
105
+ ## Installation
106
+
107
+ ```bash
108
+ # Create and activate the conda environment, then clone the repository
109
+ conda create -n mm_dls python=3.10 -y
110
+ conda activate mm_dls
111
+ git clone https://github.com/your_username/MM-DLS-NSCLC.git
112
+ ```
113
+ ### Install dependencies
114
+ ```bash
115
+ cd MM-DLS-NSCLC && pip install -r requirements.txt
116
+ ```
117
+ ## Usage
118
+
119
+ ### 🔽 Download Pretrained Models
120
+
121
+ Pretrained MM-DLS models are available for direct download:
122
+
123
+ - **MM-DLS (Full multimodal, best checkpoint)**
124
+ [⬇️ Download Pretrained Model](https://drive.google.com/file/d/1IcyCwMgCX8wv0NMp84U4wlzhLoXH7ayx/view?usp=drive_link) (size: 1.3 MB)
125
+ The MM-DLS model is intentionally lightweight (~1.3 MB), as it employs compact CNN encoders and MLP-based multimodal fusion rather than large pretrained backbones, enabling efficient deployment and fast inference.
126
+
127
+
128
+
129
+ After downloading, place the model files under the `./MODEL/` directory.
130
+
131
+ Training:
132
+ ```bash
133
+ python train_patient_model.py
134
+ ```
135
+ Evaluation:
136
+ ```bash
137
+ python test.py
138
+ ```
139
+ Example Forward Pass:
140
+ ```bash
141
+ jupyter nbconvert --to notebook --execute run_sample.ipynb
142
+ ```
143
+ ## Model Performance (from publication)
144
+ ### Histological Subtype Classification:
145
+
146
+ AUC: 0.85 ~ 0.92 across cohorts
147
+
148
+ AP: 0.81 ~ 0.86
149
+
150
+ ### TNM Stage Prediction:
151
+
152
+ AUC: Stage I-II (0.86 ~ 0.96), Stage III (0.85 ~ 0.95), Stage IV (0.83 ~ 0.95)
153
+
154
+ AP and calibration were maintained across internal and external sets.
155
+
156
+ ### DFS & OS Prognosis:
157
+
158
+ C-index: up to 0.75
159
+
160
+ Time-dependent AUC (1/2/3 years): 0.77 ~ 0.91
161
+
162
+ Brier score: consistently < 0.2 for DFS and < 0.3 for OS
163
+
164
+ Superior to single-modality models (clinical-only or imaging-only)
165
+
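For readers unfamiliar with the C-index reported above, the following sketch computes Harrell's concordance index in plain Python: among comparable patient pairs, it is the fraction in which the patient with the shorter observed time received the higher predicted risk. The function name `concordance_index` and the toy inputs are assumptions for illustration; the evaluation code behind the published numbers is not shown here and may differ, for example in how ties are handled.

```python
def concordance_index(risks, times, events):
    """Harrell's C-index; tied risk scores count as half-concordant."""
    concordant = 0.0
    comparable = 0
    n = len(risks)
    for i in range(n):
        if not events[i]:
            continue  # the earlier time of a pair must be an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort (hypothetical values): patient 1 is censored at t = 5.
risks = [0.9, 0.7, 0.6, 0.1]
times = [2.0, 5.0, 4.0, 8.0]
events = [1, 0, 1, 1]
c_index = concordance_index(risks, times, events)  # 4 of 5 pairs concordant
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives some context for the reported value of up to 0.75.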
166
+ ## Reference
167
+ Please cite our original publication when using this work:
168
+
169
+ ## License
170
+ This project is licensed under the MIT License.
171
+
172
+ ⚠️ **Notice:** The pretrained model is shared solely for research validation purposes and **should not be used, distributed, or cited before the associated study is formally published**.
173
+
174
+ ## Contact
175
+ For any questions or collaborations, please contact:
176
+
177
+ Dr. Fang Dai: daifang_cool@163.com