# Medical Policy RAG Chatbot on Hugging Face Spaces

This Space runs a FastAPI application that provides a Retrieval‑Augmented Generation (RAG) chatbot for medical policies. The API is exposed through FastAPI's built‑in Swagger UI at `/docs`; no custom frontend is required, and you can interact with the API directly using tools such as `curl` or the Swagger UI itself.

## How it works
- Documents (PDFs, etc.) are pre‑processed into a FAISS index (`data/faiss.index`) and a metadata pickle (`data/metadata.pkl`).
- Queries are embedded with the `intfloat/e5-base-v2` model, searched in the FAISS index, and the top‑k chunks are fed to `google/flan-t5-large` for answer generation.
- Optional Redis caching is used for scenario‑based follow‑up confirmations.
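The retrieval step above can be sketched with a toy in‑memory index. This is a minimal illustration only: it assumes unit‑normalized embeddings so that the inner product equals cosine similarity. In production the vectors come from `intfloat/e5-base-v2` and the search runs against `data/faiss.index`; here random vectors stand in for both.

```python
import numpy as np

# Toy stand-in for the FAISS retrieval step. In the real app the document
# vectors are e5-base-v2 embeddings loaded from data/faiss.index; random
# unit vectors are used here purely to show the top-k search shape.
rng = np.random.default_rng(0)
doc_vectors = rng.normal(size=(100, 768)).astype("float32")
doc_vectors /= np.linalg.norm(doc_vectors, axis=1, keepdims=True)

query = rng.normal(size=768).astype("float32")
query /= np.linalg.norm(query)

k = 5
scores = doc_vectors @ query              # cosine similarity (unit-norm vectors)
top_k = np.argsort(scores)[::-1][:k]      # indices of the k best-matching chunks

print(top_k)
```

The chunks at `top_k` would then be looked up in `metadata.pkl` and passed as context to `google/flan-t5-large`.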

## Deploying on Hugging Face Spaces (free)
1. **Create a new Space** on huggingface.co → *New Space* → select *Docker* as the SDK.
2. **Clone the Space repository** locally:
   ```bash
   git clone https://huggingface.co/spaces/<your-username>/medical-rag-chatbot
   cd medical-rag-chatbot
   ```
3. **Copy the project files** into the repository (the `medical_rag_chatbot` folder you have locally). Ensure the following structure:
   ```
   .
   ├─ Dockerfile          # (created automatically)
   ├─ main.py             # FastAPI entry point
   ├─ requirements.txt    # Python dependencies
   ├─ README.md           # Project description (this file)
   ├─ space.md            # Short description shown on the Space page
   └─ data/
       ├─ faiss.index
       └─ metadata.pkl
   ```
4. **Commit and push**:
   ```bash
   git add .
   git commit -m "Add RAG chatbot"
   git push
   ```
5. The Space automatically builds the Docker image from the provided `Dockerfile`. When the build finishes, the FastAPI server starts and the Swagger UI becomes available at `https://<your-username>-medical-rag-chatbot.hf.space/docs`.
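Step 3 marks the `Dockerfile` as created automatically. If you need to supply your own, a minimal sketch might look like the following; the paths and port are assumptions (Docker Spaces serve on port 7860 by default, configurable via `app_port` in the Space's README metadata):

```dockerfile
# Minimal sketch -- adjust to match your actual main.py and data layout.
FROM python:3.10-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Hugging Face Docker Spaces expect the app on port 7860 by default.
EXPOSE 7860
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]
```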

## Deploying on Google Colab (free)
You can also run the chatbot in a Colab notebook and expose it publicly with **ngrok**.
```python
# 1️⃣ Install dependencies (Linux environment – faiss works out of the box)
!pip install -r requirements.txt

# 2️⃣ Install ngrok (for a public URL)
!pip install pyngrok

# 3️⃣ Start the FastAPI server in the background
import subprocess, time
proc = subprocess.Popen(
    ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"],
    cwd="/content/medical_rag_chatbot",
)

# 4️⃣ Give the server a moment to start
time.sleep(5)

# 5️⃣ Open an ngrok tunnel
from pyngrok import ngrok
public_url = ngrok.connect(8000, "http")
print("Public URL:", public_url)
```
After running the cell, open the printed URL and append `/docs` to reach the Swagger UI, where you can test the `/chat` endpoint.
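If you prefer to call the endpoint from code rather than the Swagger UI, a standard‑library sketch is shown below. The `question` field is an assumption: check the Swagger UI for the actual request schema of `/chat`, and replace the host with the URL ngrok printed.

```python
import json
import urllib.request

# Hypothetical request body -- the real schema of /chat is defined in main.py,
# so verify the field names in the Swagger UI before relying on this.
payload = {"question": "Is an MRI covered for chronic lower back pain?"}
body = json.dumps(payload).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:8000/chat",      # swap in the ngrok URL when remote
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment once the server is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```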

## Notes & Tips
- **FAISS compatibility**: The `faiss-cpu` wheel works on Linux (Hugging Face Spaces, Colab). The earlier Windows‑specific install error is irrelevant for these deployments.
- **Redis**: If you do not have a Redis server, the chatbot will still work – the caching logic simply becomes a no‑op.
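The no‑op fallback can be sketched like this; `NoOpCache` and `make_cache` are hypothetical names for illustration, and the real logic lives in the application code:

```python
import os

class NoOpCache:
    """Fallback used when Redis is unavailable: gets always miss, sets discard."""
    def get(self, key):
        return None
    def set(self, key, value, ex=None):
        pass

def make_cache():
    # Hypothetical helper illustrating the pattern: degrade to a no-op when
    # REDIS_URL is unset or the redis package / server is unavailable.
    url = os.getenv("REDIS_URL")
    if not url:
        return NoOpCache()
    try:
        import redis
        client = redis.Redis.from_url(url)
        client.ping()  # fail fast if the server is unreachable
        return client
    except Exception:
        return NoOpCache()

cache = make_cache()
```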
- **Data size**: Keep the FAISS index under ~200 MB to stay within the free Space limits.
- **Environment variables**: You can set `REDIS_URL`, `FAISS_INDEX_PATH`, etc., in the Space settings under *Environment variables*.
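A sketch of reading those variables with local defaults; `METADATA_PATH` and the default values here are assumptions, and the authoritative names are whatever `main.py` actually reads:

```python
import os

def load_settings(env=os.environ):
    # Hypothetical settings reader: REDIS_URL and FAISS_INDEX_PATH are the
    # variables mentioned above, METADATA_PATH is an assumed companion.
    return {
        "faiss_index_path": env.get("FAISS_INDEX_PATH", "data/faiss.index"),
        "metadata_path": env.get("METADATA_PATH", "data/metadata.pkl"),
        "redis_url": env.get("REDIS_URL"),  # None -> caching becomes a no-op
    }

settings = load_settings()
```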

Enjoy your free, production‑ready RAG chatbot!