Bucket: meet4150/alive_pine

973 MB

207 files

Updated 3 months ago

Name	Size	Uploaded	Xet hash
.git		3 months ago	86 items
app		3 months ago	10 items
data		3 months ago	1 items
data_from_bucket		3 months ago	1 items
hf_export		3 months ago	20 items
scripts		3 months ago	5 items
.gitignore	309 Bytes xet	3 months ago	f8fd7248
README.md	2.76 kB xet	3 months ago	64ba92db
requirements.txt	131 Bytes xet	3 months ago	ec9b5b82
test_rag.py	1.65 kB xet	3 months ago	39a61236

README.md

AliveAI Medical RAG Chatbot

Fully local medical RAG chatbot built with FastAPI, ChromaDB, SentenceTransformers, and Ollama.

Stack

Knowledge-base embeddings: BAAI/bge-base-en-v1.5
NLP routing embeddings: sentence-transformers/all-MiniLM-L6-v2
Vector database: ChromaDB with cosine similarity
Optional vector backend: Pinecone (auto-selected when Pinecone env vars are set)
Local chat model: Ollama (llama3.2:3b by default, OpenMeditron/Meditron3-8B when available)
Dataset: keivalya/MedQuad-MedicalQnADataset

Project Layout

app/
  agent/
  db/
  nlp/
scripts/
data/
chroma_db/
test_rag.py

Setup

python3.12 -m pip install -r requirements.txt
ollama pull llama3.2:3b
python3.12 scripts/download_dataset.py
python3.12 scripts/download_models.py   # optional fallback if Hugging Face TLS fails in Python
python3.12 scripts/download_hf_chat_model.py  # optional local HF fallback for chat when Ollama is unavailable
python3.12 scripts/prepare_dataset.py
python3.12 scripts/ingest.py
python3.12 -c "from test_rag import run_all_tests; run_all_tests()"
python3.12 -m uvicorn app.main:app --reload --port 8000

Pinecone Switch

Set these env vars to use Pinecone instead of local ChromaDB:

export ALIVEAI_PINECONE_API_KEY=your_key
export ALIVEAI_PINECONE_INDEX_NAME=your_index_name
# optional if your key requires host-level targeting:
export ALIVEAI_PINECONE_INDEX_HOST=your_index_host
# optional to isolate AliveAI data inside a shared index:
export ALIVEAI_PINECONE_NAMESPACE=aliveai

When Pinecone vars are present, the DB adapter uses Pinecone automatically.
When they are not present, it falls back to local ChromaDB.
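The selection rule described above could be sketched roughly as follows. This is not the project's actual adapter code: the function name `select_vector_backend` is hypothetical, and treating the API key and index name as the required pair (with host and namespace optional) is an assumption based on the comments in the export lines.

```python
import os

# Variable names come from the README. Treating these two as the required
# pair is an ASSUMPTION of this sketch, not confirmed project behavior.
REQUIRED_PINECONE_VARS = (
    "ALIVEAI_PINECONE_API_KEY",
    "ALIVEAI_PINECONE_INDEX_NAME",
)


def select_vector_backend(env=None):
    """Return "pinecone" when the required variables are set, else "chroma"."""
    env = os.environ if env is None else env
    # Empty or whitespace-only values count as "not set".
    if all(env.get(name, "").strip() for name in REQUIRED_PINECONE_VARS):
        return "pinecone"
    return "chroma"
```

Calling `select_vector_backend()` with no argument reads the real process environment; passing a plain dict makes the rule easy to check without touching your shell.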

Pinecone index requirements for this project (768 is the embedding size of BAAI/bge-base-en-v1.5, the knowledge-base model):

dimension=768
metric=cosine
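To make the two settings concrete, here is a small pure-Python sketch of what they imply: every vector sent to the index must have 768 components, and similarity is the cosine of the angle between vectors, so vector length does not affect the score. The helper names are hypothetical and this is not how ChromaDB or Pinecone compute the metric internally.

```python
import math

EXPECTED_DIMENSION = 768  # from the index requirements above


def check_dimension(vector, expected=EXPECTED_DIMENSION):
    """Raise if a vector does not match the index dimension."""
    if len(vector) != expected:
        raise ValueError(f"expected {expected} components, got {len(vector)}")


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```
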

After enabling Pinecone, run:

python3.12 scripts/ingest.py --force
python3.12 -c "from test_rag import run_all_tests; run_all_tests()"
python3.12 -m uvicorn app.main:app --reload --port 8000

Endpoints

POST /chat
GET /health
POST /reset/{session_id}
GET /validate?text1=chest+hurts&text2=chest+pain
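A minimal client sketch for these routes, using only the standard library. The base URL assumes the server was started with `--port 8000` as in the setup commands. The README does not document the `/chat` request body, so the `session_id` and `message` field names below are hypothetical placeholders; check the FastAPI models in `app/` for the real schema.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumes the uvicorn command from Setup


def health_request():
    return Request(f"{BASE_URL}/health", method="GET")


def validate_request(text1, text2):
    # urlencode turns spaces into "+", matching the README example URL.
    query = urlencode({"text1": text1, "text2": text2})
    return Request(f"{BASE_URL}/validate?{query}", method="GET")


def reset_request(session_id):
    return Request(f"{BASE_URL}/reset/{quote(session_id, safe='')}", method="POST")


def chat_request(session_id, message):
    # ASSUMPTION: these JSON field names are illustrative, not documented.
    body = json.dumps({"session_id": session_id, "message": message}).encode("utf-8")
    return Request(
        f"{BASE_URL}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Each function only builds a request object. Once the server is running, pass the result to `urllib.request.urlopen` to send it.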

Notes

All models are loaded lazily and instantiated as singletons.
Emergency intent is short-circuited before retrieval.
ChromaDB persists to the local chroma_db/ directory.
If Python cannot download Hugging Face assets in your environment, scripts/download_models.py mirrors the two SentenceTransformer model repos into models/.
HealthAgent tries Ollama first, then local Hugging Face generation, then an extractive context fallback. This order is the default behavior (ALIVEAI_CHAT_FALLBACK=auto).
If your default python3 is a different interpreter (for example Python 3.13 via conda), use python3.12 so installed dependencies match this project.
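The control flow described in these notes (emergency short-circuit before retrieval, then an ordered chain of generation backends ending in an extractive fallback) might look roughly like the sketch below. It is not HealthAgent's actual code. The keyword check is a stand-in assumption for whatever intent detection the project uses, and `answer`, `extractive_fallback`, `EMERGENCY_TERMS`, and `EMERGENCY_REPLY` are hypothetical names.

```python
# ASSUMPTION: simple keyword matching stands in for the project's real
# emergency-intent detection, which the README does not describe.
EMERGENCY_TERMS = ("can't breathe", "unconscious", "overdose")
EMERGENCY_REPLY = "Possible emergency detected. Please contact local emergency services."


def extractive_fallback(question, passages):
    """Last resort: return the top retrieved passage verbatim."""
    return passages[0] if passages else "No relevant context found."


def answer(question, retrieve, generators):
    """Answer a question, trying each generator in order.

    retrieve:   callable(question) -> list of context passages
    generators: callables(question, passages) -> reply text or None,
                ordered by preference (for example Ollama, then local HF)
    """
    # Emergency intent is handled before any retrieval happens.
    if any(term in question.lower() for term in EMERGENCY_TERMS):
        return EMERGENCY_REPLY

    passages = retrieve(question)
    for generate in generators:
        try:
            reply = generate(question, passages)
        except Exception:
            continue  # backend unavailable, move on to the next one
        if reply:
            return reply
    return extractive_fallback(question, passages)
```

Passing the backends as an ordered list keeps the preference order explicit and lets each backend fail independently without affecting the ones after it.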

Last updated: Mar 30