
Medical Policy RAG Chatbot on Hugging Face Spaces

This Space runs a FastAPI application that provides a Retrieval‑Augmented Generation (RAG) chatbot for medical policies. The API is exposed via the built‑in Swagger UI (/docs). No custom frontend is required – the user can interact directly with the API using tools like curl or the Swagger UI.

How it works

  • Documents (PDFs, etc.) are pre‑processed into a FAISS index (data/faiss.index) and a metadata pickle (data/metadata.pkl).
  • Queries are embedded with the intfloat/e5-base-v2 model, searched in the FAISS index, and the top‑k chunks are fed to google/flan-t5-large for answer generation.
  • Optional Redis caching is used for scenario‑based follow‑up confirmations.
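The retrieve-then-generate flow above can be sketched as follows. This is a minimal illustration only: it uses deterministic pseudo-random vectors and plain NumPy in place of the real e5-base-v2 embeddings and the FAISS index, and the `embed`/`top_k` helpers are hypothetical stand-ins, not the app's actual functions.

```python
import numpy as np

def embed(text: str, dim: int = 768) -> np.ndarray:
    """Stand-in for the intfloat/e5-base-v2 encoder (hypothetical).
    Note: e5 models expect a "query: " / "passage: " prefix on inputs."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(dim).astype("float32")
    return v / np.linalg.norm(v)  # normalize so inner product = cosine

def top_k(query_vec: np.ndarray, chunk_vecs: np.ndarray, k: int = 3):
    """Inner-product search over normalized vectors, analogous to a
    flat FAISS index (e.g. IndexFlatIP)."""
    scores = chunk_vecs @ query_vec
    order = np.argsort(-scores)[:k]
    return order, scores[order]

chunks = [
    "Policy A covers outpatient care.",
    "Policy B excludes cosmetic procedures.",
    "Claims must be filed within 30 days.",
]
chunk_vecs = np.stack([embed("passage: " + c) for c in chunks])

idx, scores = top_k(embed("query: claim filing deadline"), chunk_vecs, k=2)
context = "\n".join(chunks[i] for i in idx)
prompt = f"Answer using the context:\n{context}\n\nQuestion: claim filing deadline"
# `prompt` would then be handed to google/flan-t5-large for generation.
```

The real app loads precomputed chunk vectors from data/faiss.index instead of embedding passages at query time.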

Deploying on Hugging Face Spaces (free)

  1. Create a new Space on huggingface.co → New Space → select Docker as the SDK.
  2. Clone the Space repository locally:
    git clone https://huggingface.co/spaces/<your-username>/medical-rag-chatbot
    cd medical-rag-chatbot
    
  3. Copy the project files into the repository (the medical_rag_chatbot folder you have locally). Ensure the following structure:
    .
    ├─ Dockerfile          # (created automatically)
    ├─ main.py             # FastAPI entry point
    ├─ requirements.txt    # Python dependencies
    ├─ README.md           # Project description (this file)
    ├─ space.md            # Short description shown on the Space page
    └─ data/
        ├─ faiss.index
        └─ metadata.pkl
    
  4. Commit and push:
    git add .
    git commit -m "Add RAG chatbot"
    git push
    
  5. The Space will automatically build the Docker image using the provided Dockerfile. When the build finishes, the FastAPI server starts and the Swagger UI becomes available at https://<your-username>-medical-rag-chatbot.hf.space/docs.
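If your repository does not already contain the Dockerfile mentioned in step 1, a minimal sketch could look like the following (this is an assumption, not the project's actual Dockerfile; port 7860 is the default port Docker Spaces route traffic to, adjust it if your Space config differs):

```dockerfile
FROM python:3.10-slim

WORKDIR /app

# Install dependencies first so Docker can cache this layer
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code and the prebuilt FAISS index/metadata
COPY . .

# Hugging Face Docker Spaces expect the app on port 7860 by default
EXPOSE 7860
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]
```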

Deploying on Google Colab (free)

You can also run the chatbot in a Colab notebook and expose it publicly with ngrok.

# 1️⃣ Install dependencies (Linux environment – faiss works out of the box)
!pip install -r requirements.txt

# 2️⃣ Install ngrok (for public URL)
!pip install pyngrok

# 3️⃣ Start the FastAPI server in the background
import subprocess, time
proc = subprocess.Popen(["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"], cwd="/content/medical_rag_chatbot")

# 4️⃣ Give the server a moment to start
time.sleep(5)

# 5️⃣ Open an ngrok tunnel (ngrok requires a free authtoken from your dashboard)
from pyngrok import ngrok
# ngrok.set_auth_token("YOUR_AUTHTOKEN")  # uncomment and fill in once per runtime
public_url = ngrok.connect(8000, "http")
print("Public URL:", public_url)

After running the cell, open the printed URL and append /docs to reach the Swagger UI, where you can test the /chat endpoint.

Notes & Tips

  • FAISS compatibility: The faiss-cpu wheel works on Linux (Hugging Face Spaces, Colab). The earlier Windows‑specific install error is irrelevant for these deployments.
  • Redis: If you do not have a Redis server, the chatbot will still work – the caching logic simply becomes a no‑op.
  • Data size: Files larger than 10 MB must be tracked with Git LFS in the Space repository. Keep the FAISS index small (roughly under ~200 MB) to stay within free‑tier storage and build‑time limits.
  • Environment variables: You can set REDIS_URL, FAISS_INDEX_PATH, etc., in the Space settings under Environment variables.

Enjoy your free, production‑ready RAG chatbot!