# AliveAI Medical RAG Chatbot
A medical RAG chatbot that runs fully locally by default, built with FastAPI, ChromaDB, SentenceTransformers, and Ollama. An optional Pinecone backend is available for remote vector storage.
## Stack
- Knowledge-base embeddings: `BAAI/bge-base-en-v1.5`
- NLP routing embeddings: `sentence-transformers/all-MiniLM-L6-v2`
- Vector database: ChromaDB with cosine similarity
- Optional vector backend: Pinecone (auto-selected when Pinecone env vars are set)
- Local chat model: Ollama (`llama3.2:3b` by default, `OpenMeditron/Meditron3-8B` when available)
- Dataset: `keivalya/MedQuad-MedicalQnADataset`
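The retrieval layer above ranks chunks by cosine similarity; ChromaDB's cosine space reports *distance* as `1 - similarity`, so near-duplicates score close to 0 and orthogonal vectors score 1. A minimal sketch of that metric on toy vectors (the 3-d vectors are made up for illustration; `BAAI/bge-base-en-v1.5` actually emits 768-dimensional embeddings):

```python
from math import sqrt


def cosine_distance(a, b):
    """Cosine distance as ChromaDB's cosine space reports it: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)


# Toy 3-d vectors standing in for real 768-d embeddings.
query = [1.0, 0.0, 0.0]
close = [0.9, 0.1, 0.0]  # nearly parallel -> distance near 0
far = [0.0, 1.0, 0.0]    # orthogonal -> distance exactly 1

print(cosine_distance(query, close))
print(cosine_distance(query, far))
```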
## Project Layout
```text
app/
  agent/
  db/
  nlp/
scripts/
data/
chroma_db/
test_rag.py
```
## Setup
```bash
python3.12 -m pip install -r requirements.txt
ollama pull llama3.2:3b
python3.12 scripts/download_dataset.py
python3.12 scripts/download_models.py      # optional: mirror the embedding models locally if Hugging Face downloads fail in Python (e.g. TLS errors)
python3.12 scripts/download_hf_chat_model.py # optional local HF fallback for chat when Ollama is unavailable
python3.12 scripts/prepare_dataset.py
python3.12 scripts/ingest.py
python3.12 -c "from test_rag import run_all_tests; run_all_tests()"
python3.12 -m uvicorn app.main:app --reload --port 8000
```
## Pinecone Switch
Set these env vars to use Pinecone instead of local ChromaDB:
```bash
export ALIVEAI_PINECONE_API_KEY=your_key
export ALIVEAI_PINECONE_INDEX_NAME=your_index_name
# optional if your key requires host-level targeting:
export ALIVEAI_PINECONE_INDEX_HOST=your_index_host
# optional to isolate AliveAI data inside a shared index:
export ALIVEAI_PINECONE_NAMESPACE=aliveai
```
When Pinecone vars are present, the DB adapter uses Pinecone automatically.
When they are not present, it falls back to local ChromaDB.
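The selection logic can be sketched as a simple environment check. This is an illustrative sketch of the documented behavior, not the project's actual adapter code (the env var names come from this README; the returned backend labels are stand-ins for real client objects):

```python
import os


def select_vector_backend(env=None):
    """Pick the vector backend the way the adapter is documented to:
    Pinecone when its env vars are set, otherwise local ChromaDB.
    The string return values are illustrative only."""
    if env is None:
        env = os.environ
    if env.get("ALIVEAI_PINECONE_API_KEY") and env.get("ALIVEAI_PINECONE_INDEX_NAME"):
        return "pinecone"
    return "chroma"


print(select_vector_backend({}))  # no Pinecone vars -> local ChromaDB
print(select_vector_backend({
    "ALIVEAI_PINECONE_API_KEY": "your_key",
    "ALIVEAI_PINECONE_INDEX_NAME": "your_index_name",
}))  # both vars set -> Pinecone
```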
Pinecone index requirements for this project:
- `dimension=768`
- `metric=cosine`
After enabling Pinecone, run:
```bash
python3.12 scripts/ingest.py --force
python3.12 -c "from test_rag import run_all_tests; run_all_tests()"
python3.12 -m uvicorn app.main:app --reload --port 8000
```
## Endpoints
- `POST /chat`
- `GET /health`
- `POST /reset/{session_id}`
- `GET /validate?text1=chest+hurts&text2=chest+pain`
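A hedged sketch of calling `POST /chat` from Python. The payload fields (`session_id`, `message`) are assumptions for illustration; check `app/main.py` for the actual request schema. Building the request needs no running server; actually sending it does.

```python
import json
from urllib import request

BASE_URL = "http://localhost:8000"


def build_chat_request(session_id, message):
    """Assemble a POST /chat request. The payload field names here are
    assumptions, not confirmed by this README."""
    payload = {"session_id": session_id, "message": message}
    return request.Request(
        f"{BASE_URL}/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("demo-session", "What are the symptoms of anemia?")
print(req.full_url, req.get_method())

# To actually send it, the uvicorn server from Setup must be running:
# with request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```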
## Notes
- All models are loaded lazily and instantiated as singletons.
- Emergency intent is short-circuited before retrieval.
- ChromaDB persists to the local `chroma_db/` directory.
- If Python cannot download Hugging Face assets in your environment, `scripts/download_models.py` mirrors the two SentenceTransformer model repos into `models/`.
- `HealthAgent` tries Ollama first, then Hugging Face local generation, then extractive context fallback (`ALIVEAI_CHAT_FALLBACK=auto`, default).
- If your default `python3` is a different interpreter (for example Python 3.13 via conda), use `python3.12` so installed dependencies match this project.
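The `HealthAgent` fallback chain noted above (Ollama, then local Hugging Face generation, then extractive context) follows a try-in-order pattern. A hypothetical sketch of that pattern with stand-in backends, not the project's actual implementation:

```python
def answer_with_fallback(prompt, backends):
    """Try each chat backend in order and return the first successful answer.
    Mirrors the documented chain: Ollama -> local HF -> extractive context.
    A real agent would catch narrower exception types per backend."""
    last_error = None
    for backend in backends:
        try:
            return backend(prompt)
        except Exception as exc:
            last_error = exc
    raise RuntimeError(f"all chat backends failed: {last_error}")


# Stand-in backends: the first two fail as if unavailable.
def ollama_chat(prompt):
    raise ConnectionError("Ollama not running")


def hf_local_chat(prompt):
    raise RuntimeError("no local HF model downloaded")


def extractive_answer(prompt):
    return "extractive answer assembled from retrieved context"


print(answer_with_fallback("chest pain", [ollama_chat, hf_local_chat, extractive_answer]))
```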