Multi-Modal-Medical-Analysis-System

Amarthya7 committed on Mar 11, 2025

Commit

5c4804a

verified

1 Parent(s): ba205bb

Update README.md

Files changed (1)

README.md +362 -64

@@ -1,64 +1,362 @@
-# MediSync: Multi-Modal Medical Analysis System
-MediSync is an AI-powered healthcare solution that combines X-ray image analysis with patient report text processing to provide comprehensive medical insights.
-## Features
-- **X-ray Image Analysis**: Detects abnormalities in chest X-rays using pre-trained vision models from Hugging Face.
-- **Medical Report Processing**: Extracts key information from patient reports using NLP models.
-- **Multi-modal Integration**: Combines insights from both image and text data for more accurate diagnosis suggestions.
-- **User-friendly Interface**: Simple web interface for uploading images and reports.
-## Project Structure
-```
-mediSync/
-├── app.py                    # Main application with Gradio interface
-├── models/
-│   ├── image_analyzer.py     # X-ray image analysis module
-│   ├── text_analyzer.py      # Medical report text analysis module
-│   └── multimodal_fusion.py  # Fusion of image and text insights
-├── utils/
-│   ├── preprocessing.py      # Data preprocessing utilities
-│   └── visualization.py      # Result visualization utilities
-├── data/
-│   └── sample/               # Sample data for testing
-└── tests/                    # Unit tests
-```
-## Setup Instructions
-1. Clone this repository:
-```bash
-git clone [repository-url]
-cd MediSync
-```
-2. Install dependencies:
-```bash
-pip install -r requirements.txt
-```
-3. Run the application:
-```bash
-python app.py
-```
-4. Access the web interface at `http://localhost:7860`
-## Models Used
-- **X-ray Analysis**: facebook/deit-base-patch16-224-medical-cxr
-- **Medical Text Analysis**: medicalai/ClinicalBERT
-- **Additional Support Models**: Medical question answering and entity recognition models
-## Use Cases
-- Preliminary screening of chest X-rays
-- Cross-validation of radiologist reports
-- Educational tool for medical students
-- Research tool for studying correlation between visual findings and written reports
-## Note
-This system is designed as a support tool and should not replace professional medical diagnosis. Always consult with healthcare professionals for medical decisions.

+# MediSync: Multi-Modal Medical Analysis System
+MediSync is an AI-powered healthcare solution that combines X-ray image analysis with patient report text processing to provide comprehensive medical insights.
+## Features
+- **X-ray Image Analysis**: Detects abnormalities in chest X-rays using pre-trained vision models from Hugging Face.
+- **Medical Report Processing**: Extracts key information from patient reports using NLP models.
+- **Multi-modal Integration**: Combines insights from both image and text data for more accurate diagnosis suggestions.
+- **User-friendly Interface**: Simple web interface for uploading images and reports.
+## Project Structure
+```
+mediSync/
+├── app.py                    # Main application with Gradio interface
+├── models/
+│   ├── image_analyzer.py     # X-ray image analysis module
+│   ├── text_analyzer.py      # Medical report text analysis module
+│   └── multimodal_fusion.py  # Fusion of image and text insights
+├── utils/
+│   ├── preprocessing.py      # Data preprocessing utilities
+│   └── visualization.py      # Result visualization utilities
+├── data/
+│   └── sample/               # Sample data for testing
+└── tests/                    # Unit tests
+```
+# MediSync: Multi-Modal Medical Analysis System
+## Comprehensive Technical Documentation
+### Table of Contents
+1. [Introduction](#introduction)
+2. [System Architecture](#system-architecture)
+3. [Installation](#installation)
+4. [Usage](#usage)
+5. [Core Components](#core-components)
+6. [Model Details](#model-details)
+7. [API Reference](#api-reference)
+8. [Extending the System](#extending-the-system)
+9. [Troubleshooting](#troubleshooting)
+10. [References](#references)
+---
+## Introduction
+MediSync is a multi-modal AI system that combines X-ray image analysis with medical report text processing to provide comprehensive medical insights. By leveraging state-of-the-art deep learning models for both vision and language understanding, MediSync can:
+- Analyze chest X-ray images to detect abnormalities
+- Extract key clinical information from medical reports
+- Fuse insights from both modalities for enhanced diagnosis support
+- Provide comprehensive visualization of analysis results
+MediSync illustrates multi-modal fusion in healthcare, where integrating image and text evidence can lead to more robust and accurate analyses than either modality alone.
+## System Architecture
+MediSync follows a modular architecture with three main components:
+1. **Image Analysis Module**: Processes X-ray images using pre-trained vision models
+2. **Text Analysis Module**: Analyzes medical reports using NLP models
+3. **Multimodal Fusion Module**: Combines insights from both modalities
+The system uses the following high-level workflow:
+```
+                      ┌─────────────────┐
+                      │    X-ray Image  │
+                      └────────┬────────┘
+                               │
+                               ▼
+┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
+│  Preprocessing  │───▶│  Image Analysis │───▶│                 │
+└─────────────────┘    └─────────────────┘    │                 │
+                                               │   Multimodal    │
+┌─────────────────┐    ┌─────────────────┐    │     Fusion      │───▶ Results
+│ Medical Report  │───▶│  Text Analysis  │───▶│                 │
+└─────────────────┘    └─────────────────┘    │                 │
+                                               └─────────────────┘
+```
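The workflow in the diagram can be sketched as two independent branches feeding one fusion step. This is only an illustration of the data flow, not MediSync's implementation: `run_pipeline` and the three `fake_*` stand-ins are hypothetical names introduced here so the sketch runs without any model weights.

```python
from typing import Any, Callable, Dict

Result = Dict[str, Any]


def run_pipeline(
    image_path: str,
    report_text: str,
    analyze_image: Callable[[str], Result],
    analyze_text: Callable[[str], Result],
    fuse: Callable[[Result, Result], Result],
) -> Result:
    """Run both modality branches, then pass their outputs to the fusion stage."""
    image_result = analyze_image(image_path)  # preprocessing + image analysis branch
    text_result = analyze_text(report_text)   # text analysis branch
    return fuse(image_result, text_result)


# Hypothetical stand-ins for the real modules.
def fake_image_stage(image_path: str) -> Result:
    return {"source": image_path, "finding": "placeholder image finding"}


def fake_text_stage(report_text: str) -> Result:
    return {"length": len(report_text), "finding": "placeholder text finding"}


def fake_fusion(image_result: Result, text_result: Result) -> Result:
    return {"image": image_result, "text": text_result}


pipeline_output = run_pipeline(
    "path/to/image.jpg",
    "Medical report text...",
    fake_image_stage,
    fake_text_stage,
    fake_fusion,
)
```

Passing the stages in as callables keeps each branch replaceable, which matches the modular layout described above.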
+## Installation
+### Prerequisites
+- Python 3.8 or higher
+- pip package manager
+### Setup Instructions
+1. Clone the repository:
+```bash
+git clone [repository-url]
+cd mediSync
+```
+2. Install dependencies:
+```bash
+pip install -r requirements.txt
+```
+3. Download sample data:
+```bash
+python -m mediSync.utils.download_samples
+```
+## Usage
+### Running the Application
+To launch the MediSync application with the Gradio interface:
+```bash
+python run.py
+```
+This will:
+1. Download sample data if not already present
+2. Initialize the application
+3. Launch the Gradio web interface
+### Web Interface
+MediSync provides a user-friendly web interface with three main tabs:
+1. **Multimodal Analysis**: Upload an X-ray image and enter a medical report for combined analysis
+2. **Image Analysis**: Upload an X-ray image for image-only analysis
+3. **Text Analysis**: Enter a medical report for text-only analysis
+### Using the Python API
+You can also use the core components directly from Python:
+```python
+from mediSync.models import MultimodalFusion
+# Initialize models
+fusion_model = MultimodalFusion()
+# Analyze image and text
+results = fusion_model.analyze("path/to/image.jpg", "Medical report text...")
+# Get explanation
+explanation = fusion_model.get_explanation(results)
+print(explanation)
+```
+## Core Components
+### Image Analysis Module
+The `XRayImageAnalyzer` class is responsible for analyzing X-ray images:
+- Uses the DeiT (Data-efficient image Transformers) model fine-tuned on chest X-rays
+- Detects abnormalities and classifies findings
+- Provides confidence scores and primary findings
+Key methods:
+- `analyze(image_path)`: Analyzes an X-ray image
+- `get_explanation(results)`: Generates a human-readable explanation
+### Text Analysis Module
+The `MedicalReportAnalyzer` class processes medical report text:
+- Extracts medical entities (conditions, treatments, tests)
+- Assesses severity level
+- Extracts key findings
+- Suggests follow-up actions
+Key methods:
+- `extract_entities(text)`: Extracts medical entities
+- `assess_severity(text)`: Determines severity level
+- `extract_findings(text)`: Extracts key clinical findings
+- `suggest_followup(text, entities, severity)`: Suggests follow-up actions
+- `analyze(text)`: Performs comprehensive analysis
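The README does not say how `assess_severity` reaches its decision, and the Model Details section lists a ClinicalBERT classifier for text. As a much simpler stand-in, the sketch below shows a keyword rule that returns a severity level. The term lists and the function name `assess_severity_by_keywords` are assumptions made for illustration only.

```python
import re
from typing import Dict, List

# Hypothetical vocabulary; a real system would need a clinically reviewed list.
SEVERITY_TERMS: Dict[str, List[str]] = {
    "severe": ["severe", "critical", "extensive"],
    "moderate": ["moderate", "significant"],
    "mild": ["mild", "minimal", "small"],
}


def assess_severity_by_keywords(text: str) -> str:
    """Return the most severe level whose keywords appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    for level in ("severe", "moderate", "mild"):  # checked from most to least severe
        if words & set(SEVERITY_TERMS[level]):
            return level
    return "unknown"
```

A rule like this ignores negation ("no severe findings" would still match "severe"), which is one reason a trained classifier is usually preferred for real reports.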
+### Multimodal Fusion Module
+The `MultimodalFusion` class combines insights from both modalities:
+- Calculates agreement between image and text analyses
+- Determines confidence-weighted findings
+- Provides comprehensive severity assessment
+- Merges follow-up recommendations
+Key methods:
+- `analyze_image(image_path)`: Analyzes image only
+- `analyze_text(text)`: Analyzes text only
+- `analyze(image_path, report_text)`: Performs multimodal analysis
+- `get_explanation(fused_results)`: Generates comprehensive explanation
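The README does not specify how agreement or confidence weighting is computed. One plausible reading, sketched below under stated assumptions, uses Jaccard overlap between the findings reported by each modality and a weighted average of per-finding confidences. The function names, the `image_weight` parameter, and the example labels are all hypothetical.

```python
from typing import Dict, Set


def agreement_score(image_labels: Set[str], text_labels: Set[str]) -> float:
    """Jaccard overlap between the findings from each modality (assumed metric)."""
    union = image_labels | text_labels
    if not union:
        return 1.0  # neither modality reported anything: treat as full agreement
    return len(image_labels & text_labels) / len(union)


def fuse_confidences(
    image_conf: Dict[str, float],
    text_conf: Dict[str, float],
    image_weight: float = 0.5,
) -> Dict[str, float]:
    """Weighted average per finding; a finding absent from one modality counts as 0.0 there."""
    fused = {}
    for label in set(image_conf) | set(text_conf):
        fused[label] = (
            image_weight * image_conf.get(label, 0.0)
            + (1.0 - image_weight) * text_conf.get(label, 0.0)
        )
    return fused


# Example with made-up confidences.
image_conf = {"pneumonia": 0.8, "effusion": 0.4}
text_conf = {"pneumonia": 0.6}
score = agreement_score(set(image_conf), set(text_conf))
fused = fuse_confidences(image_conf, text_conf)
```

With equal weights, a finding seen by only one modality has its confidence halved, so findings supported by both sources rank higher.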
+## Model Details
+### X-ray Analysis Model
+- **Model**: facebook/deit-base-patch16-224-medical-cxr
+- **Architecture**: Data-efficient image Transformer (DeiT)
+- **Training Data**: Chest X-ray datasets
+- **Input Size**: 224x224 pixels
+- **Output**: Classification probabilities for various conditions
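To make "classification probabilities" concrete, the sketch below converts raw model scores (logits) into probabilities with a softmax and picks the top class as the primary finding. It assumes a single-label classification head; the label names and logit values are invented for the example and are not taken from the model.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def primary_finding(logits: List[float], labels: List[str]) -> Tuple[str, float]:
    """Return the highest-probability label together with its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


labels = ["normal", "pneumonia", "effusion"]  # hypothetical label set
label, confidence = primary_finding([0.2, 2.1, -1.0], labels)
```

If the model were multi-label instead (several conditions can be present at once), a per-class sigmoid would replace the softmax.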
+### Medical Text Analysis Models
+- **Entity Recognition Model**: samrawal/bert-base-uncased_medical-ner
+- **Classification Model**: medicalai/ClinicalBERT
+- **Architecture**: BERT-based transformer models
+- **Training Data**: Medical text and reports
+## API Reference
+### XRayImageAnalyzer
+```python
+from mediSync.models import XRayImageAnalyzer
+# Initialize
+analyzer = XRayImageAnalyzer(model_name="facebook/deit-base-patch16-224-medical-cxr")
+# Analyze image
+results = analyzer.analyze("path/to/image.jpg")
+# Get explanation
+explanation = analyzer.get_explanation(results)
+```
+### MedicalReportAnalyzer
+```python
+from mediSync.models import MedicalReportAnalyzer
+# Initialize
+analyzer = MedicalReportAnalyzer()
+# Analyze report
+results = analyzer.analyze("Medical report text...")
+# Access specific components
+entities = results["entities"]
+severity = results["severity"]
+findings = results["findings"]
+recommendations = results["followup_recommendations"]
+```
+### MultimodalFusion
+```python
+from mediSync.models import MultimodalFusion
+# Initialize
+fusion = MultimodalFusion()
+# Multimodal analysis
+results = fusion.analyze("path/to/image.jpg", "Medical report text...")
+# Get explanation
+explanation = fusion.get_explanation(results)
+```
+## Extending the System
+### Adding New Models
+To add a new image analysis model:
+1. Create a new class that follows the same interface as `XRayImageAnalyzer`
+2. Update the `MultimodalFusion` class to use your new model
+```python
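+class NewXRayModel:
+    """Skeleton exposing the same interface as XRayImageAnalyzer."""
+    def __init__(self, model_name, device=None):
+        # Initialize your model
+        self.model_name = model_name
+        self.device = device
+    def analyze(self, image_path):
+        # Implement analysis logic and return the results
+        raise NotImplementedError("implement analysis logic")
+    def get_explanation(self, results):
+        # Generate a human-readable explanation from the results
+        raise NotImplementedError("implement explanation logic")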
+```
+### Custom Preprocessing
+You can extend the preprocessing utilities in `utils/preprocessing.py` for custom data preparation:
+```python
+def my_custom_preprocessor(image_path, **kwargs):
+    # Implement custom preprocessing and return the processed image
+    raise NotImplementedError("implement custom preprocessing")
+```
+### Visualization Extensions
+To add new visualization options, extend the utilities in `utils/visualization.py`:
+```python
+def my_custom_visualization(results, **kwargs):
+    # Create a custom visualization and return the figure
+    raise NotImplementedError("implement custom visualization")
+```
+## Troubleshooting
+### Common Issues
+1. **Model Loading Errors**
+   - Ensure you have a stable internet connection for downloading models
+   - Check that you have sufficient disk space
+   - Try specifying a different model checkpoint
+2. **Image Processing Errors**
+   - Ensure images are in a supported format (JPEG, PNG)
+   - Check that the image is a valid X-ray image
+   - Try preprocessing the image manually using the utility functions
+3. **Performance Issues**
+   - For faster inference, use a GPU if available
+   - Reduce image resolution if processing is too slow
+   - Use the text-only analysis for quicker results
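For the image format check, a quick test is to look at the file's leading bytes instead of trusting the extension. The sketch below recognizes the standard PNG and JPEG signatures using only the standard library. The helper names are hypothetical and are not part of MediSync's utilities.

```python
from typing import Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte PNG file signature
JPEG_SIGNATURE = b"\xff\xd8\xff"      # JPEG start-of-image marker plus next marker byte


def detect_image_format(header: bytes) -> Optional[str]:
    """Return "PNG" or "JPEG" based on the leading bytes, or None if unrecognized."""
    if header.startswith(PNG_SIGNATURE):
        return "PNG"
    if header.startswith(JPEG_SIGNATURE):
        return "JPEG"
    return None


def check_image_file(path: str) -> Optional[str]:
    """Read just enough of the file to identify its format."""
    with open(path, "rb") as handle:
        return detect_image_format(handle.read(8))
```

This only confirms the container format; it cannot tell whether the picture is actually a chest X-ray.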
+### Logging
+MediSync uses Python's logging module for debug information:
+```python
+import logging
+logging.basicConfig(level=logging.DEBUG)
+```
+Log files are saved to `mediSync.log` in the application directory.
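Called with only `level`, `logging.basicConfig` attaches a handler that writes to standard error, so writing to a file has to be configured somewhere. If you need to set it up yourself, the sketch below attaches both a file handler and a console handler. The logger name `"mediSync"` and the helper `configure_logging` are assumptions for illustration, not the application's own setup.

```python
import logging


def configure_logging(log_path: str = "mediSync.log", level: int = logging.DEBUG) -> logging.Logger:
    """Attach a file handler and a console handler to a named logger."""
    logger = logging.getLogger("mediSync")  # logger name is an assumption
    logger.setLevel(level)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        formatter = logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        file_handler = logging.FileHandler(log_path, encoding="utf-8")
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)
        console_handler = logging.StreamHandler()
        console_handler.setFormatter(formatter)
        logger.addHandler(console_handler)
    return logger
```

Calling `configure_logging()` once at startup and then `logging.getLogger("mediSync").debug("...")` elsewhere would send each message to both destinations.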
+## References
+### Datasets
+- [MIMIC-CXR](https://physionet.org/content/mimic-cxr/2.0.0/): Large dataset of chest radiographs with reports
+- [ChestX-ray14](https://www.nih.gov/news-events/news-releases/nih-clinical-center-provides-one-largest-publicly-available-chest-x-ray-datasets-scientific-community): NIH dataset of chest X-rays
+### Papers
+- He, K., et al. (2020). "Vision Transformers for Medical Image Analysis"
+- Irvin, J., et al. (2019). "CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison"
+- Johnson, A.E.W., et al. (2019). "MIMIC-CXR-JPG, a large publicly available database of labeled chest radiographs"
+### Tools and Libraries
+- [Hugging Face Transformers](https://huggingface.co/docs/transformers/index)
+- [PyTorch](https://pytorch.org/)
+- [Gradio](https://gradio.app/)
+---
+## License
+This project is licensed under the MIT License - see the LICENSE file for details.
+## Acknowledgments
+- The development of MediSync was inspired by recent advances in multi-modal learning in healthcare.
+- Special thanks to the open-source community for providing pre-trained models and tools.