Amarthya7 committed on
Commit 5c4804a · verified · 1 Parent(s): ba205bb

Update README.md

Files changed (1)
  1. README.md +362 -64
README.md CHANGED
@@ -1,64 +1,362 @@
- # MediSync: Multi-Modal Medical Analysis System
-
- MediSync is an AI-powered healthcare solution that combines X-ray image analysis with patient report text processing to provide comprehensive medical insights.
-
- ## Features
-
- - **X-ray Image Analysis**: Detects abnormalities in chest X-rays using pre-trained vision models from Hugging Face.
- - **Medical Report Processing**: Extracts key information from patient reports using NLP models.
- - **Multi-modal Integration**: Combines insights from both image and text data for more accurate diagnosis suggestions.
- - **User-friendly Interface**: Simple web interface for uploading images and reports.
-
- ## Project Structure
-
- ```
- mediSync/
- ├── app.py                    # Main application with Gradio interface
- ├── models/
- │   ├── image_analyzer.py     # X-ray image analysis module
- │   ├── text_analyzer.py      # Medical report text analysis module
- │   └── multimodal_fusion.py  # Fusion of image and text insights
- ├── utils/
- │   ├── preprocessing.py      # Data preprocessing utilities
- │   └── visualization.py      # Result visualization utilities
- ├── data/
- │   └── sample/               # Sample data for testing
- └── tests/                    # Unit tests
- ```
-
- ## Setup Instructions
-
- 1. Clone this repository:
- ```bash
- git clone [repository-url]
- cd MediSync
- ```
-
- 2. Install dependencies:
- ```bash
- pip install -r requirements.txt
- ```
-
- 3. Run the application:
- ```bash
- python app.py
- ```
-
- 4. Access the web interface at `http://localhost:7860`
-
- ## Models Used
-
- - **X-ray Analysis**: facebook/deit-base-patch16-224-medical-cxr
- - **Medical Text Analysis**: medicalai/ClinicalBERT
- - **Additional Support Models**: Medical question answering and entity recognition models
-
- ## Use Cases
-
- - Preliminary screening of chest X-rays
- - Cross-validation of radiologist reports
- - Educational tool for medical students
- - Research tool for studying correlation between visual findings and written reports
-
- ## Note
-
- This system is designed as a support tool and should not replace professional medical diagnosis. Always consult with healthcare professionals for medical decisions.
+ # MediSync: Multi-Modal Medical Analysis System
+
+ MediSync is an AI-powered healthcare solution that combines X-ray image analysis with patient report text processing to provide comprehensive medical insights.
+
+ ## Features
+
+ - **X-ray Image Analysis**: Detects abnormalities in chest X-rays using pre-trained vision models from Hugging Face.
+ - **Medical Report Processing**: Extracts key information from patient reports using NLP models.
+ - **Multi-modal Integration**: Combines insights from both image and text data for more accurate diagnosis suggestions.
+ - **User-friendly Interface**: Simple web interface for uploading images and reports.
+
+ ## Project Structure
+
+ ```
+ mediSync/
+ ├── app.py                    # Main application with Gradio interface
+ ├── models/
+ │   ├── image_analyzer.py     # X-ray image analysis module
+ │   ├── text_analyzer.py      # Medical report text analysis module
+ │   └── multimodal_fusion.py  # Fusion of image and text insights
+ ├── utils/
+ │   ├── preprocessing.py      # Data preprocessing utilities
+ │   └── visualization.py      # Result visualization utilities
+ ├── data/
+ │   └── sample/               # Sample data for testing
+ └── tests/                    # Unit tests
+ ```
+ # MediSync: Multi-Modal Medical Analysis System
+
+ ## Comprehensive Technical Documentation
+
+ ### Table of Contents
+ 1. [Introduction](#introduction)
+ 2. [System Architecture](#system-architecture)
+ 3. [Installation](#installation)
+ 4. [Usage](#usage)
+ 5. [Core Components](#core-components)
+ 6. [Model Details](#model-details)
+ 7. [API Reference](#api-reference)
+ 8. [Extending the System](#extending-the-system)
+ 9. [Troubleshooting](#troubleshooting)
+ 10. [References](#references)
+
+ ---
+
+ ## Introduction
+
+ MediSync is a multi-modal AI system that combines X-ray image analysis with medical report text processing to provide comprehensive medical insights. By leveraging state-of-the-art deep learning models for both vision and language understanding, MediSync can:
+
+ - Analyze chest X-ray images to detect abnormalities
+ - Extract key clinical information from medical reports
+ - Fuse insights from both modalities for enhanced diagnosis support
+ - Provide comprehensive visualization of analysis results
+
+ This AI system demonstrates the power of multi-modal fusion in the healthcare domain, where integrating information from multiple sources can lead to more robust and accurate analyses.
+
+ ## System Architecture
+
+ MediSync follows a modular architecture with three main components:
+
+ 1. **Image Analysis Module**: Processes X-ray images using pre-trained vision models
+ 2. **Text Analysis Module**: Analyzes medical reports using NLP models
+ 3. **Multimodal Fusion Module**: Combines insights from both modalities
+
+ The system uses the following high-level workflow:
+
+ ```
+ ┌─────────────────┐
+ │   X-ray Image   │
+ └────────┬────────┘
+          │
+          ▼
+ ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
+ │  Preprocessing  │───▶│ Image Analysis  │───▶│                 │
+ └─────────────────┘    └─────────────────┘    │                 │
+                                               │   Multimodal    │
+ ┌─────────────────┐    ┌─────────────────┐    │     Fusion      │───▶ Results
+ │ Medical Report  │───▶│  Text Analysis  │───▶│                 │
+ └─────────────────┘    └─────────────────┘    │                 │
+                                               └─────────────────┘
+ ```
+
+ ## Installation
+
+ ### Prerequisites
+ - Python 3.8 or higher
+ - pip package manager
+
+ ### Setup Instructions
+
+ 1. Clone the repository:
+ ```bash
+ git clone [repository-url]
+ cd mediSync
+ ```
+
+ 2. Install dependencies:
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 3. Download sample data:
+ ```bash
+ python -m mediSync.utils.download_samples
+ ```
+
+ ## Usage
+
+ ### Running the Application
+
+ To launch the MediSync application with the Gradio interface:
+
+ ```bash
+ python run.py
+ ```
+
+ This will:
+ 1. Download sample data if not already present
+ 2. Initialize the application
+ 3. Launch the Gradio web interface
+
+ ### Web Interface
+
+ MediSync provides a user-friendly web interface with three main tabs:
+
+ 1. **Multimodal Analysis**: Upload an X-ray image and enter a medical report for combined analysis
+ 2. **Image Analysis**: Upload an X-ray image for image-only analysis
+ 3. **Text Analysis**: Enter a medical report for text-only analysis
+
+ ### Command Line Usage
+
+ You can also use the core components directly from Python:
+
+ ```python
+ from mediSync.models import XRayImageAnalyzer, MedicalReportAnalyzer, MultimodalFusion
+
+ # Initialize models
+ fusion_model = MultimodalFusion()
+
+ # Analyze image and text
+ results = fusion_model.analyze("path/to/image.jpg", "Medical report text...")
+
+ # Get explanation
+ explanation = fusion_model.get_explanation(results)
+ print(explanation)
+ ```
+
+ ## Core Components
+
+ ### Image Analysis Module
+
+ The `XRayImageAnalyzer` class is responsible for analyzing X-ray images:
+
+ - Uses the DeiT (Data-efficient image Transformers) model fine-tuned on chest X-rays
+ - Detects abnormalities and classifies findings
+ - Provides confidence scores and primary findings
+
+ Key methods:
+ - `analyze(image_path)`: Analyzes an X-ray image
+ - `get_explanation(results)`: Generates a human-readable explanation
+
+ ### Text Analysis Module
+
+ The `MedicalReportAnalyzer` class processes medical report text:
+
+ - Extracts medical entities (conditions, treatments, tests)
+ - Assesses severity level
+ - Extracts key findings
+ - Suggests follow-up actions
+
+ Key methods:
+ - `extract_entities(text)`: Extracts medical entities
+ - `assess_severity(text)`: Determines severity level
+ - `extract_findings(text)`: Extracts key clinical findings
+ - `suggest_followup(text, entities, severity)`: Suggests follow-up actions
+ - `analyze(text)`: Performs comprehensive analysis
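
One way the key methods above might compose inside `analyze` is sketched below. It is an illustrative stand-in rather than MediSync's implementation: the `SketchReportAnalyzer` name, the keyword tables, and the rule-based logic are assumptions that take the place of the real NER and classification models. Only the method names and the result keys (`entities`, `severity`, `findings`, `followup_recommendations`) come from this README.

```python
import re

# Hypothetical keyword tables standing in for the NER and severity models.
CONDITION_TERMS = {"pneumonia", "effusion", "cardiomegaly", "atelectasis"}
SEVERE_TERMS = {"severe", "extensive", "critical"}
MODERATE_TERMS = {"moderate", "worsening"}


def _words(text):
    """Lowercased alphabetic tokens of the text, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


class SketchReportAnalyzer:
    """Rule-based stand-in that mirrors the documented method names."""

    def extract_entities(self, text):
        return {"conditions": sorted(_words(text) & CONDITION_TERMS)}

    def assess_severity(self, text):
        words = _words(text)
        if words & SEVERE_TERMS:
            return "severe"
        if words & MODERATE_TERMS:
            return "moderate"
        return "mild"

    def extract_findings(self, text):
        # Keep only sentences that mention at least one known condition.
        sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
        return [s for s in sentences if _words(s) & CONDITION_TERMS]

    def suggest_followup(self, text, entities, severity):
        # Placeholder rules for illustration only, not clinical guidance.
        if severity == "severe":
            return ["Urgent clinical review"]
        if entities["conditions"]:
            return ["Follow-up imaging"]
        return ["Routine follow-up"]

    def analyze(self, text):
        entities = self.extract_entities(text)
        severity = self.assess_severity(text)
        return {
            "entities": entities,
            "severity": severity,
            "findings": self.extract_findings(text),
            "followup_recommendations": self.suggest_followup(text, entities, severity),
        }


report = "Moderate right pleural effusion. No pneumothorax. Heart size is normal."
print(SketchReportAnalyzer().analyze(report))
```

Computing `entities` and `severity` once and passing them to `suggest_followup` matches the documented signature and avoids re-running the (normally expensive) models for each step.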
+
+ ### Multimodal Fusion Module
+
+ The `MultimodalFusion` class combines insights from both modalities:
+
+ - Calculates agreement between image and text analyses
+ - Determines confidence-weighted findings
+ - Provides comprehensive severity assessment
+ - Merges follow-up recommendations
+
+ Key methods:
+ - `analyze_image(image_path)`: Analyzes image only
+ - `analyze_text(text)`: Analyzes text only
+ - `analyze(image_path, report_text)`: Performs multimodal analysis
+ - `get_explanation(fused_results)`: Generates comprehensive explanation
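
The README does not say how agreement or confidence weighting is computed, so the following is only a sketch of one plausible approach. It assumes each modality yields a mapping from finding label to confidence, measures agreement as the Jaccard overlap of the label sets, and fuses confidences with a fixed weighted average. The function names and the `image_weight` default of 0.6 are hypothetical.

```python
def agreement_score(image_findings, text_findings):
    """Jaccard overlap between the finding labels of the two modalities."""
    image_labels, text_labels = set(image_findings), set(text_findings)
    union = image_labels | text_labels
    if not union:
        return 1.0  # Neither modality reports anything: treat as full agreement.
    return len(image_labels & text_labels) / len(union)


def fuse_confidences(image_findings, text_findings, image_weight=0.6):
    """Weighted average of per-label confidences; a missing label counts as 0."""
    text_weight = 1.0 - image_weight
    labels = set(image_findings) | set(text_findings)
    fused = {
        label: image_weight * image_findings.get(label, 0.0)
        + text_weight * text_findings.get(label, 0.0)
        for label in labels
    }
    # Order the result from highest to lowest fused confidence.
    return dict(sorted(fused.items(), key=lambda item: item[1], reverse=True))


image_findings = {"pneumonia": 0.75, "effusion": 0.25}
text_findings = {"pneumonia": 0.5}
print(agreement_score(image_findings, text_findings))
print(fuse_confidences(image_findings, text_findings))
```

Treating a label that one modality never mentions as confidence 0 pulls its fused score down, which is one simple way to let disagreement lower confidence.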
+
+ ## Model Details
+
+ ### X-ray Analysis Model
+
+ - **Model**: facebook/deit-base-patch16-224-medical-cxr
+ - **Architecture**: Data-efficient image Transformer (DeiT)
+ - **Training Data**: Chest X-ray datasets
+ - **Input Size**: 224x224 pixels
+ - **Output**: Classification probabilities for various conditions
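
To illustrate what "classification probabilities" can look like, the sketch below turns raw model scores (logits) into probabilities with a softmax and picks the top label. It assumes a single-label classification head; the label list and the logit values are made up for the example and are not taken from the model.

```python
import math

# Hypothetical label set; the real model's labels are not listed in this README.
LABELS = ["normal", "pneumonia", "effusion"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # Subtract the maximum for numerical stability.
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def top_finding(logits, labels=LABELS):
    """Return the most probable label together with its probability."""
    probabilities = softmax(logits)
    best = max(range(len(labels)), key=lambda index: probabilities[index])
    return labels[best], probabilities[best]


label, confidence = top_finding([0.2, 2.1, -0.5])
print(f"{label}: {confidence:.3f}")
```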
+
+ ### Medical Text Analysis Models
+
+ - **Entity Recognition Model**: samrawal/bert-base-uncased_medical-ner
+ - **Classification Model**: medicalai/ClinicalBERT
+ - **Architecture**: BERT-based transformer models
+ - **Training Data**: Medical text and reports
+
+ ## API Reference
+
+ ### XRayImageAnalyzer
+
+ ```python
+ from mediSync.models import XRayImageAnalyzer
+
+ # Initialize
+ analyzer = XRayImageAnalyzer(model_name="facebook/deit-base-patch16-224-medical-cxr")
+
+ # Analyze image
+ results = analyzer.analyze("path/to/image.jpg")
+
+ # Get explanation
+ explanation = analyzer.get_explanation(results)
+ ```
+
+ ### MedicalReportAnalyzer
+
+ ```python
+ from mediSync.models import MedicalReportAnalyzer
+
+ # Initialize
+ analyzer = MedicalReportAnalyzer()
+
+ # Analyze report
+ results = analyzer.analyze("Medical report text...")
+
+ # Access specific components
+ entities = results["entities"]
+ severity = results["severity"]
+ findings = results["findings"]
+ recommendations = results["followup_recommendations"]
+ ```
+
+ ### MultimodalFusion
+
+ ```python
+ from mediSync.models import MultimodalFusion
+
+ # Initialize
+ fusion = MultimodalFusion()
+
+ # Multimodal analysis
+ results = fusion.analyze("path/to/image.jpg", "Medical report text...")
+
+ # Get explanation
+ explanation = fusion.get_explanation(results)
+ ```
+
+ ## Extending the System
+
+ ### Adding New Models
+
+ To add a new image analysis model:
+
+ 1. Create a new class that follows the same interface as `XRayImageAnalyzer`
+ 2. Update the `MultimodalFusion` class to use your new model
+
+ ```python
+ class NewXRayModel:
+     def __init__(self, model_name, device=None):
+         # Initialize your model
+         pass
+
+     def analyze(self, image_path):
+         # Implement analysis logic and return a results dictionary
+         raise NotImplementedError
+
+     def get_explanation(self, results):
+         # Generate a human-readable explanation from the results
+         raise NotImplementedError
+ ```
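
"Follows the same interface" can be made checkable with `typing.Protocol` (available since Python 3.8). The sketch below is an assumption about how such a check could look: `ImageAnalyzer`, `DummyXRayModel`, `check_analyzer`, and the result keys are hypothetical names for illustration, and only the `analyze` and `get_explanation` method names come from this README.

```python
from typing import Any, Dict, Protocol, runtime_checkable


@runtime_checkable
class ImageAnalyzer(Protocol):
    """Structural interface: any class with these two methods qualifies."""

    def analyze(self, image_path: str) -> Dict[str, Any]: ...

    def get_explanation(self, results: Dict[str, Any]) -> str: ...


class DummyXRayModel:
    """Hypothetical replacement model returning a fixed placeholder result."""

    def analyze(self, image_path):
        return {"image_path": image_path, "primary_finding": "no finding", "confidence": 0.0}

    def get_explanation(self, results):
        return f"{results['primary_finding']} (confidence {results['confidence']:.2f})"


def check_analyzer(candidate):
    """Raise early if the candidate lacks the expected methods."""
    # isinstance() on a runtime-checkable Protocol only verifies that the
    # methods exist; it does not validate their signatures or return types.
    if not isinstance(candidate, ImageAnalyzer):
        raise TypeError(f"{type(candidate).__name__} does not implement the analyzer interface")
    return candidate


model = check_analyzer(DummyXRayModel())
print(model.get_explanation(model.analyze("path/to/image.jpg")))
```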
+
+ ### Custom Preprocessing
+
+ You can extend the preprocessing utilities in `utils/preprocessing.py` for custom data preparation:
+
+ ```python
+ def my_custom_preprocessor(image_path, **kwargs):
+     # Implement custom preprocessing and return the processed image
+     raise NotImplementedError
+ ```
+
+ ### Visualization Extensions
+
+ To add new visualization options, extend the utilities in `utils/visualization.py`:
+
+ ```python
+ def my_custom_visualization(results, **kwargs):
+     # Create a custom visualization and return the figure
+     raise NotImplementedError
+ ```
+
+ ## Troubleshooting
+
+ ### Common Issues
+
+ 1. **Model Loading Errors**
+    - Ensure you have a stable internet connection for downloading models
+    - Check that you have sufficient disk space
+    - Try specifying a different model checkpoint
+
+ 2. **Image Processing Errors**
+    - Ensure images are in a supported format (JPEG, PNG)
+    - Check that the image is a valid X-ray image
+    - Try preprocessing the image manually using the utility functions
+
+ 3. **Performance Issues**
+    - For faster inference, use a GPU if available
+    - Reduce image resolution if processing is too slow
+    - Use the text-only analysis for quicker results
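
For the supported-format check under image processing errors, a file's leading bytes are more reliable than its extension. The sketch below uses the standard PNG and JPEG file signatures; the `detect_image_format` helper is a hypothetical name and is not part of MediSync's utilities.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # First 8 bytes of every PNG file.
JPEG_SIGNATURE = b"\xff\xd8\xff"      # Start-of-image marker of a JPEG file.


def detect_image_format(path):
    """Return "PNG", "JPEG", or None based on the file's leading bytes."""
    with open(path, "rb") as handle:
        header = handle.read(len(PNG_SIGNATURE))
    if header.startswith(PNG_SIGNATURE):
        return "PNG"
    if header.startswith(JPEG_SIGNATURE):
        return "JPEG"
    return None
```

A `None` result means the file is neither format, regardless of its name, so it can be rejected before it reaches the model. This check says nothing about whether the image is actually an X-ray.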
+
+ ### Logging
+
+ MediSync uses Python's logging module for debug information:
+
+ ```python
+ import logging
+ logging.basicConfig(level=logging.DEBUG)
+ ```
+
+ Log files are saved to `mediSync.log` in the application directory.
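
If you want the same messages on the console and in a file, a configuration along the following lines is one option. This is a sketch under assumptions: the `configure_logging` helper, the `mediSync` logger name, and the message format are hypothetical, and only the `mediSync.log` filename comes from this README.

```python
import logging


def configure_logging(log_path="mediSync.log", level=logging.DEBUG):
    """Send log records to both a file and the console."""
    logger = logging.getLogger("mediSync")  # Logger name is an assumption.
    logger.setLevel(level)
    # Remove handlers left over from an earlier call to avoid duplicate lines.
    for old_handler in list(logger.handlers):
        logger.removeHandler(old_handler)
        old_handler.close()
    formatter = logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    for handler in (logging.FileHandler(log_path), logging.StreamHandler()):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

Calling `configure_logging()` once at startup and then `logging.getLogger("mediSync").debug(...)` elsewhere would write each record to both destinations.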
+
+ ## References
+
+ ### Datasets
+
+ - [MIMIC-CXR](https://physionet.org/content/mimic-cxr/2.0.0/): Large dataset of chest radiographs with reports
+ - [ChestX-ray14](https://www.nih.gov/news-events/news-releases/nih-clinical-center-provides-one-largest-publicly-available-chest-x-ray-datasets-scientific-community): NIH dataset of chest X-rays
+
+ ### Papers
+
+ - He, K., et al. (2020). "Vision Transformers for Medical Image Analysis"
+ - Irvin, J., et al. (2019). "CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison"
+ - Johnson, A.E.W., et al. (2019). "MIMIC-CXR-JPG, a large publicly available database of labeled chest radiographs"
+
+ ### Tools and Libraries
+
+ - [Hugging Face Transformers](https://huggingface.co/docs/transformers/index)
+ - [PyTorch](https://pytorch.org/)
+ - [Gradio](https://gradio.app/)
+
+ ---
+
+ ## License
+
+ This project is licensed under the MIT License - see the LICENSE file for details.
+
+ ## Acknowledgments
+
+ - The development of MediSync was inspired by recent advances in multi-modal learning in healthcare.
+ - Special thanks to the open-source community for providing pre-trained models and tools.