Fix README config and remove merge conflict markers
README.md
CHANGED
@@ -1,4 +1,3 @@
-<<<<<<< HEAD
 ---
 title: MediSim
 emoji: "🩺"
@@ -11,192 +10,84 @@ pinned: false
 
 # MediSim: Multimodal Diagnostic and Agentic Triage System
 
-MediSim is an AI-powered medical assistant web application designed to safely process health inputs. It
-=======
-# MediSim: Multimodal Diagnostic and Agentic Triage System
-
-**MediSim** is an AI-powered medical assistant web application designed to safely process complex health inputs. It serves as our core NLP research project, specifically targeting the reduction of clinical hallucination in generative healthcare applications using hybrid learning pipelines.
->>>>>>> origin/main
 
 ## Core Features
 
-MediSim offers two distinct standalone features addressing different triage and diagnostic modalities.
-
-<<<<<<< HEAD
 ### 1. Multimodal Diagnostic Assistant
 
|
| 28 |
-
-
|
| 29 |
-
-
|
| 30 |
-
-
|
| 31 |
-
-
|
| 32 |
-
-
|
| 33 |
-
-
|
| 34 |
-
-
-
-### 2. Agentic Triage & Consultation
 
-
-- **Processing**: A three-agent coordination loop:
-  - **Triage Nurse**: Empathetic intake and symptom gathering.
-  - **Specialist Doctor**: Constructing differential hypotheses and clinical steps.
-  - **Fact-Checker**: Cross-verifying responses against clinical safety guidelines to prevent hallucinations.
-- **Advantage**: Drastically mitigates clinical AI hallucination through collaborative verification.
 
-
|
| 46 |
|
| 47 |
-
|
| 48 |
|
| 49 |
-
-
|
| 50 |
-
-
|
| 51 |
-
-
|
| 52 |
|
| 53 |
-
##
|
| 54 |
|
| 55 |
-
```
|
| 56 |
MediSim/
|
| 57 |
-
|
| 58 |
-
|
| 59 |
-
|
| 60 |
-
|
| 61 |
-
|
| 62 |
-
|
| 63 |
-
|
| 64 |
```
|
| 65 |
|
| 66 |
-
##
-
-### Backend (FastAPI)
-
-1. Navigate to the backend directory:
-   ```bash
-   cd web_app_pro/backend
-   ```
-2. Install dependencies:
-   ```bash
-   pip install -r requirements.txt
-   ```
-3. Run the development server:
-   ```bash
-   python main.py
-   ```
-
-### Frontend (React)
-
-1. Navigate to the frontend directory:
-   ```bash
-   cd web_app_pro/frontend
-   ```
-2. Install dependencies:
-   ```bash
-   npm install
-   ```
-3. Run the development server:
-   ```bash
-   npm run dev
-   ```
-
-## Deployment
-
-The project includes a Dockerfile for easy deployment to platforms like Hugging Face Spaces. It serves the React application via FastAPI static mounting.
|
| 101 |
|
| 102 |
-
##
|
| 103 |
|
| 104 |
-
|
| 105 |
-
|
| 106 |
-
|
| 107 |
-
-- **Processing**: A deterministic vision-language fusion approach.
-  - Images are processed using a Convolutional Neural Network (CNN).
-  - Textual symptoms are processed using a Bidirectional LSTM (biLSTM).
-  - Features are aligned via a multimodal fusion layer to output structured diagnoses.
-- **Advantage**: Bypasses the high compute requirements of monolithic Large Multimodal Models (LMMs) and provides distinct interpretability limits.
-
-### Feature 2: Multi-Agent Triage & Consultation
-- **Purpose**: To interactively gather patient symptom data and propose verified clinical next steps.
-- **Processing**: A highly structured interactions loop involving three distinct Large Language Model (LLM) agents powered locally or via fast-inference APIs.
-  - **Triage Nurse Agent**: Engages patients to gather unstructured symptom descriptions and medical histories.
-  - **Specialist Doctor Agent**: Constructs possible differential hypotheses and clinical steps.
-  - **Medical Fact-Checker Agent**: Evaluates the specialist's outputs against clinical safety guidelines to actively block generative hallucination or unsafe recommendations.
-
-## Project Architecture (Phase 2 Focus)
-
-During **Phase 2**, we established the core hypotheses of our system:
-1. Multimodal baseline fusions can compete effectively with heavy LMMs in constrained environments.
-2. A Multi-Agent debate structure drastically mitigates clinical AI hallucination compared to standard single-prompt systems.
-
-### Directory Structure
|
| 128 |
```
|
| 129 |
-
|
| 130 |
-
|
| 131 |
-
|
| 132 |
-
|
| 133 |
-
|
| 134 |
-
|
| 135 |
-
|
| 136 |
```
|
| 137 |
|
| 138 |
-
|
| 139 |
-
|
| 140 |
-
|
| 141 |
-
|
| 142 |
-
|
| 143 |
-
-
|
| 144 |
-
-
|
| 145 |
-
-- **Phase 4: Final Evaluation** - [Pending Phase 3]
-
-## Setup and Installation
-*Note: MediSim is currently in the proposal-to-implementation transition phase.*
->>>>>>> e2fd362 (Finalize Phase 2: Refined report, resized Figure 1, updated tables, and synced deliverables)
-
-**Requirements:**
-- Python 3.10+
-- PyTorch (for the Multimodal CNN/biLSTM baselines)
-- LangChain / LlamaIndex (for Multi-Agent orchestration)
-- Streamlit (for the Web Interface)
-
-1. **Clone the repository**
-   ```bash
-   git clone https://github.com/shadowsilence94/MediSim.git
-   cd MediSim
-   ```
-2. **Install Dependencies** (Placeholder for the final requirements file)
-   ```bash
-   pip install -r requirements.txt
-   ```
-3. **Running the Web App** (Scheduled for Phase 3)
-   ```bash
-   streamlit run web_app/app.py
-   ```
-
-<<<<<<< HEAD
-=======
-## Compiling the Phase 2 Report
-The Phase 2 report is written in LaTeX using the ACL template. To compile the raw source:
-1. Ensure you have a TeX distribution (e.g., TeX Live or MiKTeX) installed.
-2. Navigate to `reports/Phase2/source/`.
-3. Run the following sequence from your terminal:
-   ```bash
-   pdflatex report.tex
-   bibtex report
-   pdflatex report.tex
-   pdflatex report.tex
-   ```
-This will generate the final `report.pdf`.
-
->>>>>>> e2fd362 (Finalize Phase 2: Refined report, resized Figure 1, updated tables, and synced deliverables)
 ## Team Members
-
 - Htut Ko Ko (st126010)
 - Imtiaz Ahmad (st126685)
 - Michael R. Lacar (st126161)
 - Aashutosh Raut (st126438)
 
-<<<<<<< HEAD
 ## References
 
-
-=======
-## References & Readings
-The architectural choices for MediSim are modeled after state-of-the-art papers exclusively retrieved from the ACL Anthology, emphasizing safe conversation generation and lightweight clinical representation learning. Refer to `reports/Phase2/report.pdf` for the full methodology and literature review.
->>>>>>> origin/main
@@ -1,4 +1,3 @@
 ---
 title: MediSim
 emoji: "🩺"
@@ -11,192 +10,84 @@ pinned: false
 
 # MediSim: Multimodal Diagnostic and Agentic Triage System
 
+MediSim is an AI-powered medical assistant web application designed to safely process health inputs. It is developed as an NLP research project focused on reducing clinical hallucination in generative healthcare applications using hybrid learning pipelines and multi-agent orchestration.
 
 ## Core Features
 
 ### 1. Multimodal Diagnostic Assistant
 
+- Purpose: Provides preliminary diagnostic assessments by combining medical image data and symptom descriptions.
+- Input: Medical scans (for example, chest X-ray) plus symptom text.
+- Architecture:
+  - Vision encoder: ResNet-18.
+  - Text encoder: biLSTM.
+  - Fusion head: late-fusion classifier.
+- Advantage: Better reliability and lower compute demands than large generic multimodal models in this domain.
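
The late-fusion idea in the list above can be illustrated with a small, framework-free sketch. It assumes each encoder has already produced one raw score per class; the labels, the scores, and the `late_fusion` helper are hypothetical stand-ins for the ResNet-18 and biLSTM outputs. Weighted averaging of per-branch probabilities is only one possible fusion head and may differ from the classifier MediSim actually trains.

```python
import math
from typing import Dict, List


def softmax(scores: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(
    image_scores: List[float],
    text_scores: List[float],
    labels: List[str],
    image_weight: float = 0.5,
) -> Dict[str, float]:
    """Combine the two branches at the decision level (hypothetical helper).

    Each branch is normalised on its own, then the probabilities are mixed
    with a convex weight, so the fused values still sum to 1.
    """
    if not (len(image_scores) == len(text_scores) == len(labels)):
        raise ValueError("both branches must score every label exactly once")
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must lie in [0, 1]")
    image_probs = softmax(image_scores)
    text_probs = softmax(text_scores)
    return {
        label: image_weight * p_img + (1.0 - image_weight) * p_txt
        for label, p_img, p_txt in zip(labels, image_probs, text_probs)
    }


# Hypothetical class labels and encoder scores, for illustration only.
LABELS = ["normal", "pneumonia", "effusion"]
fused = late_fusion([0.2, 2.1, 0.4], [0.1, 1.5, 1.2], LABELS, image_weight=0.6)
prediction = max(fused, key=fused.get)
print(prediction, round(fused[prediction], 3))
```

Keeping the branches separate until the final step is what lets each encoder stay small and be inspected on its own, which matches the lower-compute motivation stated in the list.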
 
+### 2. Agentic Triage and Consultation
 
+- Purpose: Interactively gathers symptoms and provides verified clinical guidance.
+- Processing: Three-agent collaboration loop:
+  - Triage Nurse: empathic intake and symptom collection.
+  - Specialist Doctor: differential reasoning and next-step planning.
+  - Fact Checker: verifies outputs against safety constraints.
+- Advantage: Reduces hallucination risk through explicit multi-agent verification.
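
The three-agent loop described above could be organised roughly as follows. This is a sketch under stated assumptions: the agents are plain Python functions standing in for LLM calls, and `UNSAFE_PHRASES`, `run_consultation`, and the canned replies are hypothetical, not MediSim's actual prompts or safety rules.

```python
from typing import Tuple

# Hypothetical safety rule: any draft containing one of these is rejected.
UNSAFE_PHRASES: Tuple[str, ...] = ("unspecified drug",)

FALLBACK = "Unable to produce a verified recommendation; please consult a clinician."


def triage_nurse(patient_message: str) -> str:
    """Stand-in for the intake agent: tidy up the free-text symptoms."""
    return patient_message.strip().lower()


def specialist_doctor(symptoms: str, feedback: str = "") -> str:
    """Stand-in for the specialist: draft a next step, revising on feedback."""
    if feedback:
        return f"Recommend an in-person clinical evaluation for: {symptoms}"
    # Deliberately unsafe first draft so the checker has something to block.
    return f"Take 10 tablets of an unspecified drug for: {symptoms}"


def fact_checker(draft: str) -> str:
    """Return an empty string if the draft passes, otherwise the reason."""
    for phrase in UNSAFE_PHRASES:
        if phrase in draft:
            return f"unsafe phrase: {phrase!r}"
    return ""


def run_consultation(patient_message: str, max_rounds: int = 3) -> str:
    """Nurse -> doctor -> checker, looping until a draft is verified."""
    symptoms = triage_nurse(patient_message)
    feedback = ""
    for _ in range(max_rounds):
        draft = specialist_doctor(symptoms, feedback)
        feedback = fact_checker(draft)
        if not feedback:
            return draft  # only verified drafts reach the patient
    return FALLBACK


result = run_consultation("  Persistent cough and mild fever ")
print(result)
```

The bounded `max_rounds` and the fallback message mean an unverified draft is never returned, even if the specialist keeps failing the check.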
 
+## Architecture
 
+- Frontend: React + TypeScript + Vite.
+- Backend: FastAPI + PyTorch + LangChain orchestration.
+- Authentication and Storage: Firebase Auth + Firestore.
+- Deployment target: Hugging Face Space (Docker).
 
+## Directory Layout
 
+```text
 MediSim/
+|- web_app_pro/          # Production web application
+|  |- frontend/          # React + Vite app
+|  |- backend/           # FastAPI service and model logic
+|- web_app/              # Legacy app entrypoint used for HF runtime
+|- data/                 # Trained weights and supporting assets
+|- notebooks/            # Training and experimentation notebooks
+|- reports/              # Project reports and writeups
+|- scripts/              # Deployment and utility scripts
+`- README.md
 ```
 
+## Local Development
 
+### Backend
 
+```bash
+cd web_app_pro/backend
+pip install -r requirements.txt
+python main.py
 ```
+
+### Frontend
+
+```bash
+cd web_app_pro/frontend
+npm install
+npm run dev
 ```
 
+## Hugging Face Deployment Notes
+
+This repository includes:
+
+- A Docker-based Space configuration.
+- Space runtime entrypoint through `web_app/app.py`.
+- Environment-driven Firebase and backend configuration.
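
Environment-driven configuration of the kind listed above is often loaded once at startup. The sketch below shows one way to do that; the variable names `FIREBASE_PROJECT_ID`, `BACKEND_HOST`, and `BACKEND_PORT`, the `Settings` class, and the default values are all assumptions made for illustration, since the README does not name the variables the project actually reads.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical runtime settings; field names are illustrative."""

    firebase_project_id: str
    backend_host: str
    backend_port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from an environment mapping, failing fast if one is missing."""
    project_id = env.get("FIREBASE_PROJECT_ID", "")
    if not project_id:
        raise RuntimeError("FIREBASE_PROJECT_ID must be set")
    return Settings(
        firebase_project_id=project_id,
        # Assumed defaults; adjust to whatever the Space actually exposes.
        backend_host=env.get("BACKEND_HOST", "0.0.0.0"),
        backend_port=int(env.get("BACKEND_PORT", "7860")),
    )


# Passing a plain dict keeps the example independent of the real environment.
settings = load_settings({"FIREBASE_PROJECT_ID": "demo-project", "BACKEND_PORT": "8000"})
print(settings)
```

Accepting the mapping as a parameter makes the loader easy to exercise in tests without touching the process environment or committing secrets to the repository.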
+
 ## Team Members
+
 - Htut Ko Ko (st126010)
 - Imtiaz Ahmad (st126685)
 - Michael R. Lacar (st126161)
 - Aashutosh Raut (st126438)
 
 ## References
 
+See project reports under `reports/` for methodology, literature review, and evaluation details.