# Medical Policy RAG Chatbot on Hugging Face Spaces
This Space runs a FastAPI application that provides a Retrieval‑Augmented Generation (RAG) chatbot for medical policies. The API is exposed via the built‑in Swagger UI (`/docs`); no custom frontend is required, and you can interact with the API directly using tools such as `curl` or the Swagger UI.
## How it works
- Documents (PDFs, etc.) are pre‑processed into a FAISS index (`data/faiss.index`) and a metadata pickle (`data/metadata.pkl`).
- Queries are embedded with the `intfloat/e5-base-v2` model, searched in the FAISS index, and the top‑k chunks are fed to `google/flan-t5-large` for answer generation.
- Optional Redis caching is used for scenario‑based follow‑up confirmations.
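The retrieval step above is a nearest‑neighbour search over embedding vectors. The sketch below mimics it with toy 3‑dimensional vectors and plain‑Python cosine similarity standing in for e5 embeddings and FAISS; the data and function names are illustrative only. (Note that e5 models expect queries to be prefixed with `query: ` before embedding.)

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, chunk_vecs, k=2):
    # Return the indices of the k most similar chunk vectors
    scored = sorted(enumerate(chunk_vecs),
                    key=lambda iv: cosine(query_vec, iv[1]),
                    reverse=True)
    return [i for i, _ in scored[:k]]

# Toy 3-dimensional "embeddings" standing in for e5 vectors
chunks = ["policy A covers X", "policy B covers Y", "unrelated text"]
vecs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.0, 1.0]]
query = [0.8, 0.2, 0.0]

idxs = top_k(query, vecs, k=2)
print([chunks[i] for i in idxs])  # the two chunks fed to the generator
```

In the real app, FAISS performs this search over the pre‑built `data/faiss.index`, and the selected chunks are passed to `google/flan-t5-large` as context.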
## Deploying on Hugging Face Spaces (free)
1. **Create a new Space** on huggingface.co → *New Space* → select *Docker* as the SDK.
2. **Clone the Space repository** locally:
```bash
git clone https://huggingface.co/spaces/<your-username>/medical-rag-chatbot
cd medical-rag-chatbot
```
3. **Copy the project files** into the repository (the `medical_rag_chatbot` folder you have locally). Ensure the following structure:
```
.
├─ Dockerfile # (created automatically)
├─ main.py # FastAPI entry point
├─ requirements.txt # Python dependencies
├─ README.md # Project description (this file)
├─ space.md # Short description shown on the Space page
└─ data/
├─ faiss.index
└─ metadata.pkl
```
4. **Commit and push**:
```bash
git add .
git commit -m "Add RAG chatbot"
git push
```
5. The Space automatically builds the Docker image from the provided `Dockerfile`. When the build finishes, the FastAPI server starts and the Swagger UI becomes available at `https://<your-username>-medical-rag-chatbot.hf.space/docs`.
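The step above assumes the `Dockerfile` already exists in the repository. If you need to write one yourself, a minimal sketch for a FastAPI Space might look like the following (Docker Spaces route traffic to port 7860 by default; adjust the port if your Space settings differ):

```dockerfile
FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Hugging Face Docker Spaces expect the app on port 7860 by default
EXPOSE 7860
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]
```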
## Deploying on Google Colab (free)
You can also run the chatbot in a Colab notebook and expose it publicly with **ngrok**.
```python
# 1️⃣ Install dependencies (assumes the project folder is already at
#    /content/medical_rag_chatbot, e.g. cloned or uploaded beforehand;
#    faiss-cpu works out of the box on Linux)
!pip install -r /content/medical_rag_chatbot/requirements.txt
# 2️⃣ Install ngrok (for public URL)
!pip install pyngrok
# 3️⃣ Start the FastAPI server in the background
import subprocess, time, os
proc = subprocess.Popen(["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"], cwd="/content/medical_rag_chatbot")
# 4️⃣ Give the server a moment to start
time.sleep(5)
# 5️⃣ Open an ngrok tunnel (ngrok requires a free authtoken from dashboard.ngrok.com)
from pyngrok import ngrok
# ngrok.set_auth_token("<your-ngrok-authtoken>")  # set once per environment
tunnel = ngrok.connect(8000, "http")
print("Public URL:", tunnel.public_url)
```
After running the cell, open the printed URL and append `/docs` to reach the Swagger UI, where you can test the `/chat` endpoint.
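Either deployment can also be exercised from any HTTP client. The sketch below builds a POST request against the `/chat` endpoint using only the standard library; the request body field name (`query`) is an assumption here — confirm the actual schema in the Swagger UI before sending.

```python
import json
import urllib.request

# Replace with your actual Space or ngrok URL
SPACE_URL = "https://<your-username>-medical-rag-chatbot.hf.space"

def build_chat_request(question: str) -> urllib.request.Request:
    # The field name "query" is an assumption; check the schema in /docs
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        f"{SPACE_URL}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("What does policy A cover?")
print(req.full_url, req.get_method())
# Send it with: urllib.request.urlopen(req) once the URL is filled in
```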
## Notes & Tips
- **FAISS compatibility**: The `faiss-cpu` wheel works on Linux (Hugging Face Spaces, Colab). The earlier Windows‑specific install error is irrelevant for these deployments.
- **Redis**: If you do not have a Redis server, the chatbot will still work – the caching logic simply becomes a no‑op.
- **Data size**: Keep the FAISS index under ~200 MB to stay within the free Space limits.
- **Environment variables**: You can set `REDIS_URL`, `FAISS_INDEX_PATH`, etc., in the Space settings under *Environment variables*.
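The Redis no‑op fallback mentioned above can be implemented with a small wrapper like the sketch below. This is illustrative, not the app's actual code — the class and method names are made up, and the behavior shown (silently degrading when no server is reachable) is the pattern the note describes.

```python
class OptionalCache:
    """Uses Redis when available, silently degrades to a no-op otherwise."""

    def __init__(self, redis_url=None):
        self.client = None
        if redis_url:
            try:
                import redis  # only needed when a URL is configured
                self.client = redis.Redis.from_url(redis_url)
                self.client.ping()  # verify the server is reachable
            except Exception:
                self.client = None  # no server: degrade gracefully

    def get(self, key):
        if self.client is None:
            return None
        value = self.client.get(key)
        return value.decode("utf-8") if value else None

    def set(self, key, value, ttl=3600):
        if self.client is not None:
            self.client.set(key, value, ex=ttl)

cache = OptionalCache()          # no REDIS_URL configured
cache.set("followup:42", "yes")  # silently ignored
print(cache.get("followup:42"))  # None
```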
Enjoy your free, production‑ready RAG chatbot!