# Medical Policy RAG Chatbot on Hugging Face Spaces

This Space runs a FastAPI application that provides a Retrieval-Augmented Generation (RAG) chatbot for medical policies. The API is exposed via the built-in Swagger UI (`/docs`). No custom frontend is required; you can interact with the API directly using tools like `curl` or the Swagger UI.
## How it works

- Documents (PDFs, etc.) are pre-processed into a FAISS index (`data/faiss.index`) and a metadata pickle (`data/metadata.pkl`).
- Queries are embedded with the `intfloat/e5-base-v2` model, searched against the FAISS index, and the top-k chunks are fed to `google/flan-t5-large` for answer generation.
- Optional Redis caching is used for scenario-based follow-up confirmations.
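The retrieval step above can be sketched in a few lines. This is a toy illustration using NumPy in place of FAISS (an `IndexFlatIP` search over L2-normalized vectors is just an inner product); in the real app the vectors come from `intfloat/e5-base-v2`, which expects `"query: "` / `"passage: "` prefixes. The passages below are made up for illustration:

```python
import numpy as np

# Toy stand-in for the e5 embeddings: in the real app each text is encoded
# with intfloat/e5-base-v2 and L2-normalized before indexing.
rng = np.random.default_rng(0)
passages = [
    "Policy A covers annual eye exams.",
    "Policy B excludes cosmetic surgery.",
    "Policy C covers physiotherapy after referral.",
]
passage_vecs = rng.normal(size=(len(passages), 768)).astype("float32")
passage_vecs /= np.linalg.norm(passage_vecs, axis=1, keepdims=True)

def search(query_vec: np.ndarray, k: int = 2) -> list[int]:
    """Inner-product search over normalized vectors (what IndexFlatIP does)."""
    scores = passage_vecs @ query_vec
    return np.argsort(-scores)[:k].tolist()

# Simulate a query close to passage 0, then retrieve the top-k chunks.
query_vec = passage_vecs[0] + 0.02 * rng.normal(size=768).astype("float32")
query_vec /= np.linalg.norm(query_vec)
top_k = search(query_vec)
print([passages[i] for i in top_k])
```

The retrieved chunks are then concatenated into a prompt for `google/flan-t5-large`, which generates the final answer.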
## Deploying on Hugging Face Spaces (free)

1. **Create a new Space** on huggingface.co → *New Space* → select *Docker* as the SDK.
2. **Clone the Space repository** locally:

   ```bash
   git clone https://huggingface.co/spaces/<your-username>/medical-rag-chatbot
   cd medical-rag-chatbot
   ```

3. **Copy the project files** into the repository (the `medical_rag_chatbot` folder you have locally). Ensure the following structure:

   ```
   .
   ├─ Dockerfile          # (created automatically)
   ├─ main.py             # FastAPI entry point
   ├─ requirements.txt    # Python dependencies
   ├─ README.md           # Project description (this file)
   ├─ space.md            # Short description shown on the Space page
   └─ data/
      ├─ faiss.index
      └─ metadata.pkl
   ```

4. **Commit and push**:

   ```bash
   git add .
   git commit -m "Add RAG chatbot"
   git push
   ```

5. The Space will automatically build the Docker image using the provided `Dockerfile`. When the build finishes, the FastAPI server starts and the Swagger UI becomes available at `https://<your-username>-medical-rag-chatbot.hf.space/docs`.
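If you need to write the `Dockerfile` yourself rather than rely on the auto-generated one, a minimal sketch for a FastAPI app on Spaces might look like the following. It assumes `main.py` and `requirements.txt` sit at the repo root; note that Docker Spaces route traffic to port 7860 by default:

```dockerfile
# Minimal sketch -- adjust paths to your actual layout.
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so Docker can cache this layer.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Hugging Face Spaces expects the app on port 7860 by default.
EXPOSE 7860
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]
```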
## Deploying on Google Colab (free)

You can also run the chatbot in a Colab notebook and expose it publicly with **ngrok**.
```python
# 1️⃣ Install dependencies (Linux environment; faiss works out of the box)
!pip install -r requirements.txt

# 2️⃣ Install ngrok (for a public URL)
!pip install pyngrok

# 3️⃣ Start the FastAPI server in the background
import subprocess, time
proc = subprocess.Popen(
    ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"],
    cwd="/content/medical_rag_chatbot",
)

# 4️⃣ Give the server a moment to start
time.sleep(5)

# 5️⃣ Open an ngrok tunnel to the server
from pyngrok import ngrok
# ngrok now requires a (free) auth token: ngrok.set_auth_token("<token>")
public_url = ngrok.connect(8000, "http")
print("Public URL:", public_url)
```
After running the cell, open the printed URL with `/docs` appended; the Swagger UI lets you test the `/chat` endpoint.
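The same request the Swagger UI sends can be built programmatically. This sketch only constructs the request without sending it; the `question` field is a hypothetical payload shape, so check the schema shown at `/docs` for the real field names:

```python
import json
from urllib.request import Request

# Hypothetical payload -- the field name "question" is an assumption
# for illustration; consult /docs for the actual /chat schema.
payload = {"question": "Is an annual eye exam covered?"}

req = Request(
    "http://localhost:8000/chat",  # or the public ngrok / Space URL
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.get_method(), req.full_url)
```

Once the server is up, pass `req` to `urllib.request.urlopen` (or use `curl -X POST` with the same JSON body) to get the generated answer.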
## Notes & Tips

- **FAISS compatibility**: The `faiss-cpu` wheel works on Linux (Hugging Face Spaces, Colab). The earlier Windows-specific install error is irrelevant for these deployments.
- **Redis**: If you do not have a Redis server, the chatbot will still work; the caching logic simply becomes a no-op.
- **Data size**: Keep the FAISS index under ~200 MB to stay within the free Space limits.
- **Environment variables**: You can set `REDIS_URL`, `FAISS_INDEX_PATH`, etc. in the Space settings under *Environment variables*.
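The graceful Redis fallback and environment-variable configuration described above can be sketched as follows. The variable names `REDIS_URL` and `FAISS_INDEX_PATH` come from this README; the `Cache` wrapper itself is an illustrative assumption, not the app's actual code:

```python
import os

# Configuration with sensible defaults; override via the Space settings.
REDIS_URL = os.getenv("REDIS_URL", "")  # empty -> caching falls back to memory
FAISS_INDEX_PATH = os.getenv("FAISS_INDEX_PATH", "data/faiss.index")

class Cache:
    """Illustrative cache: uses Redis when reachable, else a local dict."""

    def __init__(self, redis_url: str):
        self._store = {}
        self._redis = None
        if redis_url:
            try:
                import redis  # optional dependency
                self._redis = redis.Redis.from_url(redis_url)
                self._redis.ping()
            except Exception:
                self._redis = None  # degrade to the no-op-style fallback

    def get(self, key: str):
        if self._redis is not None:
            value = self._redis.get(key)
            return value.decode() if value is not None else None
        return self._store.get(key)

    def set(self, key: str, value: str) -> None:
        if self._redis is not None:
            self._redis.set(key, value)
        else:
            self._store[key] = value

cache = Cache(REDIS_URL)
cache.set("last-scenario", "eye-exam-followup")
print(cache.get("last-scenario"))
```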
Enjoy your free, production-ready RAG chatbot!