---
title: MedChatBot
emoji: 💊
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
---
# MedChatBot
A medical chatbot that uses a Retrieval-Augmented Generation (RAG) architecture to answer medical questions from medical literature. It pairs **Google Gemini 2.5 Pro** as the language model with the **Pinecone** vector database for document retrieval.
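
The retrieve-then-generate flow can be illustrated with a toy sketch. This is not the project's code: the bag-of-words `embed` function stands in for the sentence-transformers model, a plain Python list stands in for the Pinecone index, and the names `embed`, `cosine`, `retrieve`, and `build_prompt`, as well as the sample sentences, are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Combine retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up sample chunks standing in for text extracted from the PDFs.
chunks = [
    "Acne is a skin condition that causes pimples.",
    "Asthma is a chronic disease of the airways.",
    "Anemia is a shortage of red blood cells.",
]
top = retrieve("What is asthma?", chunks, k=1)
prompt = build_prompt("What is asthma?", top)
print(top)  # ['Asthma is a chronic disease of the airways.']
```

In the real application, the prompt would be sent to the language model, and the similarity search would be delegated to the vector database rather than computed in a loop.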
## Technology Stack
- **Backend**: Flask
- **Language Model**: Google Gemini 2.5 Pro
- **Vector Database**: Pinecone
- **Embeddings**: HuggingFace sentence-transformers (all-MiniLM-L6-v2)
- **Document Processing**: LangChain, PyPDF
- **Frontend**: HTML/CSS/JavaScript
## Installation & Setup
### Step 1: Clone the Repository
```bash
git clone https://github.com/TMTien31/MedChatBot.git
cd MedChatBot
```
### Step 2: Create Virtual Environment
```bash
# Create virtual environment
python -m venv venv
# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
```
### Step 3: Install Dependencies
```bash
pip install -r requirements.txt
```
### Step 4: Get API Keys
#### Google Gemini API Key:
1. Go to [Google AI Studio](https://makersuite.google.com/app/apikey)
2. Create a new API key
3. Copy the generated key
#### Pinecone API Key:
1. Sign up at [Pinecone](https://www.pinecone.io/)
2. Go to your dashboard
3. Copy your API key from the "API Keys" section
### Step 5: Create Environment File
Create a `.env` file in the project root directory:
```bash
# Create .env file
touch .env # On macOS/Linux
# or create manually on Windows
```
Add your API keys to the `.env` file:
```env
PINECONE_API_KEY=your_pinecone_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
```
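
Before moving on, it can help to confirm that both keys are actually set. The following is a minimal sketch that assumes the simple `KEY=value` format shown above; `parse_env` and `missing_keys` are hypothetical helpers written for this check, not part of the project, and they say nothing about how the project itself loads the file.

```python
REQUIRED_KEYS = ("PINECONE_API_KEY", "GEMINI_API_KEY")


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


sample = """
# comment lines are ignored
PINECONE_API_KEY=abc123
GEMINI_API_KEY=
"""
print(missing_keys(parse_env(sample)))  # ['GEMINI_API_KEY']
```

To check a real file, read it first, for example `missing_keys(parse_env(open(".env").read()))`. The check only detects absent or empty values, so placeholder text left in the file would still pass.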
### Step 6: Prepare Medical Documents
- Place your PDF medical documents in the `Data/` folder
- The project includes "Gale Encyclopedia of Medicine Vol. 1 (A-B).pdf" by default
- You can add more medical PDFs to expand the knowledge base
### Step 7: Create Vector Index (Run Once)
**Important**: Run this step once during initial setup, and again whenever you add new documents to the `Data/` folder.
```bash
python store_index.py
```
This script will:
- Read all PDF files from the `Data/` directory
- Split text into 500-character chunks with 20-character overlap
- Generate embeddings using sentence-transformers
- Create and populate a Pinecone index named "medchatbot"
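
To make the chunking parameters concrete, here is a simplified fixed-window splitter using the 500-character size and 20-character overlap listed above. It is a sketch, not the code in `store_index.py`: the function name `split_text` is hypothetical, and LangChain's text splitters generally try to break on separators such as newlines, so real chunk boundaries will differ.

```python
def split_text(text, chunk_size=500, chunk_overlap=20):
    """Split text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # each window starts 480 characters after the last
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


text = "".join(chr(97 + i % 26) for i in range(1200))  # 1200 sample characters
chunks = split_text(text)
print([len(c) for c in chunks])  # [500, 500, 240]
```

The overlap means the last 20 characters of one chunk are repeated at the start of the next, which reduces the chance that a sentence relevant to a question is cut in half at a chunk boundary.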
**Note**: This process may take several minutes depending on the size of your documents.
## Running the Application
### Start the Flask Server
```bash
python app.py
```
### Access the Application
1. Open your web browser
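2. Navigate to `http://localhost:8080` (`http://0.0.0.0:8080` may also work, but some browsers and operating systems, notably Windows, do not accept it as a destination address)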
3. You should see the medical chatbot interface
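
If you prefer to check from a script that the server is responding, a small standard-library probe is enough. This sketch assumes the app listens on port 8080 as stated above; `is_up` is a hypothetical helper, not part of the project.

```python
import urllib.error
import urllib.request


def is_up(url="http://localhost:8080", timeout=3.0):
    """Return True if the URL answers with a 2xx or 3xx status, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error responses (4xx/5xx).
        return False


if __name__ == "__main__":
    print("server is up" if is_up() else "server is not responding")
```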