# AliveAI Medical RAG Chatbot

Fully local medical RAG chatbot built with FastAPI, ChromaDB, SentenceTransformers, and Ollama.

## Stack

- Knowledge-base embeddings: `BAAI/bge-base-en-v1.5`
- NLP routing embeddings: `sentence-transformers/all-MiniLM-L6-v2`
- Vector database: ChromaDB with cosine similarity
- Optional vector backend: Pinecone (auto-selected when Pinecone env vars are set)
- Health chat model: Hugging Face (`aaditya/Llama3-OpenBioLLM-8B` by default)
- Dataset: `keivalya/MedQuad-MedicalQnADataset`
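Both embedding models produce dense vectors that the vector store compares with cosine similarity. As a reminder of the metric (not project code), a minimal pure-Python version:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product of the vectors divided by the
    # product of their Euclidean norms; 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-d vectors standing in for real 768-d BGE embeddings.
print(round(cosine_similarity([1.0, 0.0, 1.0], [2.0, 0.0, 2.0]), 6))  # 1.0 (parallel)
print(round(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]), 6))  # 0.0 (orthogonal)
```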
## Project Layout

```text
app/
  agent/
  db/
  nlp/
scripts/
data/
chroma_db/
test_rag.py
```
## Setup

```bash
python3.12 -m pip install -r requirements.txt
python3.12 scripts/download_dataset.py
python3.12 scripts/download_models.py  # optional fallback if Hugging Face TLS fails in Python
python3.12 scripts/prepare_dataset.py
python3.12 scripts/ingest.py
python3.12 -c "from test_rag import run_all_tests; run_all_tests()"
python3.12 -m uvicorn app.main:app --reload --port 8000
```
## Pinecone Switch

Set these env vars to use Pinecone instead of local ChromaDB:

```bash
export ALIVEAI_PINECONE_API_KEY=your_key
export ALIVEAI_PINECONE_INDEX_NAME=your_index_name
# optional if your key requires host-level targeting:
export ALIVEAI_PINECONE_INDEX_HOST=your_index_host
# optional to isolate AliveAI data inside a shared index:
export ALIVEAI_PINECONE_NAMESPACE=aliveai
# optional backend selection: auto | pinecone | chroma
export ALIVEAI_VECTOR_BACKEND=auto
# optional explicit embedding dimension validation for Pinecone
export ALIVEAI_EMBEDDING_DIMENSION=768
```

When the Pinecone env vars are present, the DB adapter uses Pinecone automatically; when they are not present, it falls back to local ChromaDB.
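The selection rule can be restated in a few lines. This is an illustrative sketch of the documented behaviour, not the project's actual adapter code:

```python
import os

def select_vector_backend(env=None):
    # Restates the documented rule: an explicit ALIVEAI_VECTOR_BACKEND wins;
    # otherwise "auto" picks Pinecone when its credentials are set, else Chroma.
    env = os.environ if env is None else env
    forced = env.get("ALIVEAI_VECTOR_BACKEND", "auto")
    if forced in ("pinecone", "chroma"):
        return forced
    has_pinecone = env.get("ALIVEAI_PINECONE_API_KEY") and env.get("ALIVEAI_PINECONE_INDEX_NAME")
    return "pinecone" if has_pinecone else "chroma"

print(select_vector_backend({}))  # chroma
print(select_vector_backend({"ALIVEAI_PINECONE_API_KEY": "k",
                             "ALIVEAI_PINECONE_INDEX_NAME": "kb"}))  # pinecone
print(select_vector_backend({"ALIVEAI_VECTOR_BACKEND": "chroma",
                             "ALIVEAI_PINECONE_API_KEY": "k"}))  # chroma (explicit override)
```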
Pinecone index requirements for this project:

- `metric=cosine`
- `dimension` must match the active embedding model's output dimension
  (`ALIVEAI_EMBEDDING_DIMENSION` can be set explicitly; otherwise it is inferred from the model)
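The dimension rule can be checked up front. `validate_dimension` below is a hypothetical helper showing the documented precedence (explicit env value over model-inferred dimension), not a function from this codebase:

```python
def validate_dimension(index_dimension, model_dimension, explicit_dimension=None):
    # Explicit ALIVEAI_EMBEDDING_DIMENSION takes precedence; otherwise the
    # dimension inferred from the embedding model is the expected value.
    expected = explicit_dimension if explicit_dimension is not None else model_dimension
    if index_dimension != expected:
        raise ValueError(
            f"Pinecone index dimension {index_dimension} does not match "
            f"embedding dimension {expected}"
        )
    return expected

# bge-base-en-v1.5 outputs 768-dimensional vectors.
print(validate_dimension(768, 768))         # 768
print(validate_dimension(1024, 768, 1024))  # 1024 (explicit override wins)
```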
After enabling Pinecone, run:

```bash
python3.12 scripts/ingest.py --force
python3.12 -c "from test_rag import run_all_tests; run_all_tests()"
python3.12 -m uvicorn app.main:app --reload --port 8000
```
## Endpoints

- `POST /chat`
- `GET /health`
- `POST /reset/{session_id}`
- `GET /validate?text1=chest+hurts&text2=chest+pain`
- `GET /ingest/schema`
- `POST /ingest/text`
- `POST /ingest/file` (multipart; supports `.txt`, `.pdf`, `.doc`, `.docx`)
- `GET /ingest/task/{task_id}`

Note: ingestion endpoints store uploaded content in the local ChromaDB `medical_kb` collection.
## Background Ingestion (Celery)

```bash
export ALIVEAI_CELERY_BROKER_URL=redis://localhost:6379/0
export ALIVEAI_CELERY_RESULT_BACKEND=redis://localhost:6379/0
celery -A app.celery_app.celery_app worker --loglevel=info
```

Use `POST /ingest/text` or `POST /ingest/file` with `async_process=true` (the default) to queue ingestion in the background.
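The `async_process` flow can be sketched as follows. `FakeQueue` stands in for Celery's `.delay()` so the sketch runs without Redis; the names here are illustrative, not the project's actual task module:

```python
import uuid

class FakeQueue:
    # Stand-in for the Celery broker so the sketch runs without Redis.
    def __init__(self):
        self.tasks = {}

    def enqueue(self, payload):
        task_id = str(uuid.uuid4())
        self.tasks[task_id] = payload
        return task_id

def dispatch_ingest(payload, queue, async_process=True):
    # async_process=true (the default) queues the work and returns a task id
    # that clients can poll via GET /ingest/task/{task_id}; false runs inline.
    if async_process:
        return {"status": "queued", "task_id": queue.enqueue(payload)}
    return {"status": "done"}

queue = FakeQueue()
resp = dispatch_ingest({"text": "aspirin dosing notes"}, queue)
print(resp["status"], resp["task_id"] in queue.tasks)  # queued True
```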
## RAG Parameters (Env)

```bash
export ALIVEAI_CHUNK_SIZE=700
export ALIVEAI_CHUNK_OVERLAP=150
export ALIVEAI_RAG_TOP_K=5
export ALIVEAI_LLM_TOP_P=0.9
export ALIVEAI_LLM_TOP_K=40
export ALIVEAI_HEALTH_MODEL=aaditya/Llama3-OpenBioLLM-8B
export ALIVEAI_HEALTH_MODEL_PROVIDER=hf
export ALIVEAI_HF_MAX_NEW_TOKENS=220
export ALIVEAI_LLM_TEMPERATURE=0.2
export ALIVEAI_OCR_ENABLED=true
export ALIVEAI_OCR_LANG=en
export ALIVEAI_OCR_MIN_PDF_TEXT=120
```
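The chunking parameters map onto a sliding window. A minimal character-based sketch using the defaults above (the project's real splitter may be token- or sentence-aware):

```python
def chunk_text(text, chunk_size=700, chunk_overlap=150):
    # Sliding window: each chunk starts chunk_size - chunk_overlap characters
    # after the previous one, so adjacent chunks share chunk_overlap characters.
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - chunk_overlap, 1), step)]

document = "x" * 2000
chunks = chunk_text(document)
print(len(chunks), len(chunks[0]), len(chunks[-1]))  # 4 700 350
```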
## Notes

- All models are loaded lazily and instantiated as singletons.
- Emergency intent is short-circuited before retrieval.
- ChromaDB persists to the local `chroma_db/` directory.
- If Python cannot download Hugging Face assets in your environment, `scripts/download_models.py` mirrors the two SentenceTransformer model repos into `models/`.
- `HealthAgent` uses Hugging Face generation first (`aaditya/Llama3-OpenBioLLM-8B` by default), then falls back to extractive answers from the retrieved context if generation is unavailable.
- PDF OCR fallback is enabled for scanned PDFs when PaddleOCR + pypdfium2 are installed.
- If your default `python3` is a different interpreter (for example Python 3.13 via conda), use `python3.12` so installed dependencies match this project.