# AliveAI Medical RAG Chatbot

Fully local medical RAG chatbot built with FastAPI, ChromaDB, SentenceTransformers, and Ollama.

## Stack

- Knowledge-base embeddings: `BAAI/bge-base-en-v1.5`
- NLP routing embeddings: `sentence-transformers/all-MiniLM-L6-v2`
- Vector database: ChromaDB with cosine similarity
- Optional vector backend: Pinecone (auto-selected when Pinecone env vars are set)
- Health chat model: Hugging Face (`aaditya/Llama3-OpenBioLLM-8B` by default)
- Dataset: `keivalya/MedQuad-MedicalQnADataset`
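Both embedding models produce dense vectors that the vector store compares with cosine similarity. As a reminder of the metric (not project code), a minimal pure-Python version:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product of the vectors divided by the
    # product of their Euclidean norms; 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-d vectors standing in for real 768-d BGE embeddings.
print(round(cosine_similarity([1.0, 0.0, 1.0], [2.0, 0.0, 2.0]), 6))  # 1.0 (parallel)
print(round(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]), 6))  # 0.0 (orthogonal)
```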
## Project Layout

```text
app/
  agent/
  db/
  nlp/
scripts/
data/
chroma_db/
test_rag.py
```
## Setup

```bash
python3.12 -m pip install -r requirements.txt
python3.12 scripts/download_dataset.py
python3.12 scripts/download_models.py  # optional fallback if Hugging Face TLS fails in Python
python3.12 scripts/prepare_dataset.py
python3.12 scripts/ingest.py
python3.12 -c "from test_rag import run_all_tests; run_all_tests()"
python3.12 -m uvicorn app.main:app --reload --port 8000
```
## Pinecone Switch

Set these env vars to use Pinecone instead of local ChromaDB:

```bash
export ALIVEAI_PINECONE_API_KEY=your_key
export ALIVEAI_PINECONE_INDEX_NAME=your_index_name
# optional if your key requires host-level targeting:
export ALIVEAI_PINECONE_INDEX_HOST=your_index_host
# optional to isolate AliveAI data inside a shared index:
export ALIVEAI_PINECONE_NAMESPACE=aliveai
# optional backend selection: auto | pinecone | chroma
export ALIVEAI_VECTOR_BACKEND=auto
# optional explicit embedding dimension validation for Pinecone
export ALIVEAI_EMBEDDING_DIMENSION=768
```

When the Pinecone env vars are present, the DB adapter uses Pinecone automatically; when they are not present, it falls back to local ChromaDB.
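The selection rule can be restated in a few lines. This is an illustrative sketch of the documented behaviour, not the project's actual adapter code:

```python
import os

def select_vector_backend(env=None):
    # Restates the documented rule: an explicit ALIVEAI_VECTOR_BACKEND wins;
    # otherwise "auto" picks Pinecone when its credentials are set, else Chroma.
    env = os.environ if env is None else env
    forced = env.get("ALIVEAI_VECTOR_BACKEND", "auto")
    if forced in ("pinecone", "chroma"):
        return forced
    has_pinecone = env.get("ALIVEAI_PINECONE_API_KEY") and env.get("ALIVEAI_PINECONE_INDEX_NAME")
    return "pinecone" if has_pinecone else "chroma"

print(select_vector_backend({}))  # chroma
print(select_vector_backend({"ALIVEAI_PINECONE_API_KEY": "k",
                             "ALIVEAI_PINECONE_INDEX_NAME": "kb"}))  # pinecone
print(select_vector_backend({"ALIVEAI_VECTOR_BACKEND": "chroma",
                             "ALIVEAI_PINECONE_API_KEY": "k"}))  # chroma (explicit override)
```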
Pinecone index requirements for this project:

- `metric=cosine`
- `dimension` must match the active embedding model's output dimension
  (`ALIVEAI_EMBEDDING_DIMENSION` can be set explicitly; otherwise it is inferred from the model)
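The dimension rule can be checked up front. `validate_dimension` below is a hypothetical helper showing the documented precedence (explicit env value over model-inferred dimension), not a function from this codebase:

```python
def validate_dimension(index_dimension, model_dimension, explicit_dimension=None):
    # Explicit ALIVEAI_EMBEDDING_DIMENSION takes precedence; otherwise the
    # dimension inferred from the embedding model is the expected value.
    expected = explicit_dimension if explicit_dimension is not None else model_dimension
    if index_dimension != expected:
        raise ValueError(
            f"Pinecone index dimension {index_dimension} does not match "
            f"embedding dimension {expected}"
        )
    return expected

# bge-base-en-v1.5 outputs 768-dimensional vectors.
print(validate_dimension(768, 768))         # 768
print(validate_dimension(1024, 768, 1024))  # 1024 (explicit override wins)
```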
After enabling Pinecone, run:

```bash
python3.12 scripts/ingest.py --force
python3.12 -c "from test_rag import run_all_tests; run_all_tests()"
python3.12 -m uvicorn app.main:app --reload --port 8000
```
## Endpoints

- `POST /chat`
- `GET /health`
- `POST /reset/{session_id}`
- `GET /validate?text1=chest+hurts&text2=chest+pain`
- `GET /ingest/schema`
- `POST /ingest/text`
- `POST /ingest/file` (multipart; supports `.txt`, `.pdf`, `.doc`, `.docx`)
- `GET /ingest/task/{task_id}`

Note: ingestion endpoints store uploaded content in the local ChromaDB `medical_kb` collection.
## Background Ingestion (Celery)

```bash
export ALIVEAI_CELERY_BROKER_URL=redis://localhost:6379/0
export ALIVEAI_CELERY_RESULT_BACKEND=redis://localhost:6379/0
celery -A app.celery_app.celery_app worker --loglevel=info
```

Use `POST /ingest/text` or `POST /ingest/file` with `async_process=true` (the default) to queue ingestion in the background.
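The `async_process` flow can be sketched as follows. `FakeQueue` stands in for Celery's `.delay()` so the sketch runs without Redis; the names here are illustrative, not the project's actual task module:

```python
import uuid

class FakeQueue:
    # Stand-in for the Celery broker so the sketch runs without Redis.
    def __init__(self):
        self.tasks = {}

    def enqueue(self, payload):
        task_id = str(uuid.uuid4())
        self.tasks[task_id] = payload
        return task_id

def dispatch_ingest(payload, queue, async_process=True):
    # async_process=true (the default) queues the work and returns a task id
    # that clients can poll via GET /ingest/task/{task_id}; false runs inline.
    if async_process:
        return {"status": "queued", "task_id": queue.enqueue(payload)}
    return {"status": "done"}

queue = FakeQueue()
resp = dispatch_ingest({"text": "aspirin dosing notes"}, queue)
print(resp["status"], resp["task_id"] in queue.tasks)  # queued True
```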
## RAG Parameters (Env)

```bash
export ALIVEAI_CHUNK_SIZE=700
export ALIVEAI_CHUNK_OVERLAP=150
export ALIVEAI_RAG_TOP_K=5
export ALIVEAI_LLM_TOP_P=0.9
export ALIVEAI_LLM_TOP_K=40
export ALIVEAI_HEALTH_MODEL=aaditya/Llama3-OpenBioLLM-8B
export ALIVEAI_HEALTH_MODEL_PROVIDER=hf
export ALIVEAI_HF_MAX_NEW_TOKENS=220
export ALIVEAI_LLM_TEMPERATURE=0.2
export ALIVEAI_OCR_ENABLED=true
export ALIVEAI_OCR_LANG=en
export ALIVEAI_OCR_MIN_PDF_TEXT=120
```
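The chunking parameters map onto a sliding window. A minimal character-based sketch using the defaults above (the project's real splitter may be token- or sentence-aware):

```python
def chunk_text(text, chunk_size=700, chunk_overlap=150):
    # Sliding window: each chunk starts chunk_size - chunk_overlap characters
    # after the previous one, so adjacent chunks share chunk_overlap characters.
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - chunk_overlap, 1), step)]

document = "x" * 2000
chunks = chunk_text(document)
print(len(chunks), len(chunks[0]), len(chunks[-1]))  # 4 700 350
```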
## Notes

- All models are loaded lazily and instantiated as singletons.
- Emergency intent is short-circuited before retrieval.
- ChromaDB persists to the local `chroma_db/` directory.
- If Python cannot download Hugging Face assets in your environment, `scripts/download_models.py` mirrors the two SentenceTransformer model repos into `models/`.
- `HealthAgent` uses Hugging Face generation first (`aaditya/Llama3-OpenBioLLM-8B` by default), then falls back to extractive answers from the retrieved context if generation is unavailable.
- PDF OCR fallback is enabled for scanned PDFs when PaddleOCR + pypdfium2 are installed.
- If your default `python3` is a different interpreter (for example Python 3.13 via conda), use `python3.12` so installed dependencies match this project.