Medical Policy RAG Chatbot on Hugging Face Spaces
This Space runs a FastAPI application that provides a Retrieval‑Augmented Generation (RAG) chatbot for medical policies. The API is exposed via the built‑in Swagger UI (/docs). No custom frontend is required – the user can interact directly with the API using tools like curl or the Swagger UI.
How it works
- Documents (PDFs, etc.) are pre‑processed into a FAISS index (data/faiss.index) and a metadata pickle (data/metadata.pkl).
- Queries are embedded with the intfloat/e5-base-v2 model and searched in the FAISS index; the top‑k chunks are fed to google/flan-t5-large for answer generation.
- Optional Redis caching is used for scenario‑based follow‑up confirmations.
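The query path described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Space's actual code: the function names, the metadata layout (a list of dicts with a "text" key), the shape returned by embed, and the prompt wording are all hypothetical. The "query: " prefix follows the recommendation on the E5 model card.

```python
from typing import Any, Callable, Dict, List


def retrieve(
    question: str,
    embed: Callable[[str], Any],
    index: Any,
    metadata: List[Dict[str, str]],
    k: int = 3,
) -> List[str]:
    """Return the text of the k chunks closest to the question."""
    # The E5 model card recommends prefixing search queries with "query: ".
    query_vector = embed("query: " + question)  # assumed shape: (1, dim)
    # A FAISS index's search() returns (distances, ids), one row per query.
    _distances, ids = index.search(query_vector, k)
    # FAISS pads the id row with -1 when fewer than k neighbours exist.
    return [metadata[int(i)]["text"] for i in ids[0] if int(i) != -1]


def build_prompt(question: str, chunks: List[str]) -> str:
    """Assemble a generator prompt from the retrieved chunks (hypothetical wording)."""
    context = "\n\n".join(chunks)
    return (
        "Answer the question using only the policy excerpts below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

In a real deployment, embed would probably wrap the encode() call of a sentence-transformers model, index would come from faiss.read_index("data/faiss.index"), and the prompt would be passed to the flan-t5 generator. Passing them in as arguments keeps the sketch independent of how main.py actually loads them.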
Deploying on Hugging Face Spaces (free)
- Create a new Space on huggingface.co → New Space → select Docker as the SDK.
- Clone the Space repository locally:
    git clone https://huggingface.co/spaces/<your-username>/medical-rag-chatbot
    cd medical-rag-chatbot
- Copy the project files into the repository (the medical_rag_chatbot folder you have locally). Ensure the following structure:
    .
    ├─ Dockerfile          # (created automatically)
    ├─ main.py             # FastAPI entry point
    ├─ requirements.txt    # Python dependencies
    ├─ README.md           # Project description (this file)
    ├─ space.md            # Short description shown on the Space page
    └─ data/
       ├─ faiss.index
       └─ metadata.pkl
- Commit and push:
    git add .
    git commit -m "Add RAG chatbot"
    git push
- The Space will automatically build the Docker image using the provided Dockerfile. When the build finishes, the FastAPI server starts and the Swagger UI becomes available at https://<your-username>-medical-rag-chatbot.hf.space/docs.
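Before pushing, it can help to confirm that the repository matches the layout listed in the steps above. The script below is a small sketch of such a check; the helper name missing_files and the idea of running it from the repository root are assumptions, while the paths come from the structure given in this README.

```python
from pathlib import Path
from typing import List

# Paths taken from the repository layout described in this README.
REQUIRED_PATHS = [
    "Dockerfile",
    "main.py",
    "requirements.txt",
    "README.md",
    "space.md",
    "data/faiss.index",
    "data/metadata.pkl",
]


def missing_files(repo_root: str) -> List[str]:
    """Return the required paths that do not exist as files under repo_root."""
    root = Path(repo_root)
    return [p for p in REQUIRED_PATHS if not (root / p).is_file()]


if __name__ == "__main__":
    # Run from the repository root before `git push`.
    missing = missing_files(".")
    if missing:
        print("Missing before push:", ", ".join(missing))
    else:
        print("All required files are present.")
```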
Deploying on Google Colab (free)
You can also run the chatbot in a Colab notebook and expose it publicly with ngrok.
# 1️⃣ Install dependencies (Linux environment – faiss works out of the box)
!pip install -r /content/medical_rag_chatbot/requirements.txt
# 2️⃣ Install ngrok (for public URL)
!pip install pyngrok
# 3️⃣ Start the FastAPI server in the background
import subprocess, time
proc = subprocess.Popen(["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"], cwd="/content/medical_rag_chatbot")
# 4️⃣ Give the server a moment to start
time.sleep(5)
# 5️⃣ Open an ngrok tunnel
from pyngrok import ngrok
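# ngrok requires an account authtoken; if none is configured yet, call ngrok.set_auth_token("<your-authtoken>") first
public_url = ngrok.connect(8000, "http").public_url
print("Swagger UI:", public_url + "/docs")
After running the cell, click the printed URL. It opens the Swagger UI, where you can test the /chat endpoint.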
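The fixed five-second pause in the cell may be too short if the server has to load models before it starts listening. One alternative is to poll the port until it accepts connections. The helper below is a sketch using only the standard library; the name wait_for_port and its default timeout are assumptions, not part of the project.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or refused); wait briefly and retry.
            time.sleep(interval)
    return False
```

In the notebook, a call such as wait_for_port("127.0.0.1", 8000) could stand in for time.sleep(5), opening the ngrok tunnel only when it returns True.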
Notes & Tips
- FAISS compatibility: The faiss-cpu wheel works on Linux (Hugging Face Spaces, Colab). The earlier Windows‑specific install error is irrelevant for these deployments.
- Redis: If you do not have a Redis server, the chatbot will still work; the caching logic simply becomes a no‑op.
- Data size: Keep the FAISS index under ~200 MB to stay within the free Space limits.
- Environment variables: You can set REDIS_URL, FAISS_INDEX_PATH, etc., in the Space settings under Environment variables.
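As an illustration of how the environment variables and the Redis fallback could fit together, here is a minimal sketch. It is not taken from main.py: NullCache, load_settings, and make_cache are hypothetical names, and the default index path is the data/faiss.index location mentioned earlier in this README.

```python
import os
from typing import Optional


class NullCache:
    """Stand-in used when Redis is not configured: it stores nothing."""

    def get(self, key: str) -> Optional[str]:
        return None

    def set(self, key: str, value: str, ex: Optional[int] = None) -> None:
        pass


def load_settings() -> dict:
    """Read configuration from environment variables, falling back to defaults."""
    return {
        "faiss_index_path": os.environ.get("FAISS_INDEX_PATH", "data/faiss.index"),
        "redis_url": os.environ.get("REDIS_URL"),  # None when unset
    }


def make_cache(redis_url: Optional[str]):
    """Return a Redis client if a URL is set and reachable, otherwise a NullCache."""
    if not redis_url:
        return NullCache()
    try:
        import redis  # imported lazily so the package stays optional

        client = redis.Redis.from_url(redis_url, decode_responses=True)
        client.ping()  # fail fast if the server is unreachable
        return client
    except Exception:
        # Any import or connection problem degrades to the no-op cache.
        return NullCache()
```

Because NullCache exposes the same get and set calls the sketch uses on the Redis client, the calling code does not need to know which one it received.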
Enjoy your free, production‑ready RAG chatbot!