# AliveAI Medical RAG Chatbot

Fully local medical RAG chatbot built with FastAPI, ChromaDB, SentenceTransformers, and Ollama.

## Stack

- Knowledge-base embeddings: `BAAI/bge-base-en-v1.5`
- NLP routing embeddings: `sentence-transformers/all-MiniLM-L6-v2`
- Vector database: ChromaDB with cosine similarity
- Optional vector backend: Pinecone (auto-selected when Pinecone env vars are set)
- Local chat model: Ollama (`llama3.2:3b` by default, `OpenMeditron/Meditron3-8B` when available)
- Dataset: `keivalya/MedQuad-MedicalQnADataset`
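Retrieval ranks knowledge-base chunks by cosine similarity over the `bge-base-en-v1.5` embeddings (768-dimensional). A minimal sketch of the metric itself, using toy 3-dimensional vectors in place of real embeddings:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors; real KB embeddings from bge-base-en-v1.5 have 768 dimensions.
query = [0.1, 0.9, 0.2]
doc_close = [0.2, 0.8, 0.1]
doc_far = [0.9, 0.1, 0.0]

assert cosine_similarity(query, doc_close) > cosine_similarity(query, doc_far)
```

ChromaDB computes this internally when the collection is configured with the cosine metric; the sketch only illustrates why nearby embeddings rank higher.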
## Project Layout

```text
app/
agent/
db/
nlp/
scripts/
data/
chroma_db/
test_rag.py
```

## Setup

```bash
python3.12 -m pip install -r requirements.txt
ollama pull llama3.2:3b
python3.12 scripts/download_dataset.py
python3.12 scripts/download_models.py         # optional fallback if Hugging Face TLS fails in Python
python3.12 scripts/download_hf_chat_model.py  # optional local HF fallback for chat when Ollama is unavailable
python3.12 scripts/prepare_dataset.py
python3.12 scripts/ingest.py
python3.12 -c "from test_rag import run_all_tests; run_all_tests()"
python3.12 -m uvicorn app.main:app --reload --port 8000
```

## Pinecone Switch

Set these env vars to use Pinecone instead of local ChromaDB:

```bash
export ALIVEAI_PINECONE_API_KEY=your_key
export ALIVEAI_PINECONE_INDEX_NAME=your_index_name
# optional if your key requires host-level targeting:
export ALIVEAI_PINECONE_INDEX_HOST=your_index_host
# optional to isolate AliveAI data inside a shared index:
export ALIVEAI_PINECONE_NAMESPACE=aliveai
```

When Pinecone vars are present, the DB adapter uses Pinecone automatically. When they are not present, it falls back to local ChromaDB.
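The adapter's selection logic presumably resembles the sketch below. Only the env var names come from this README; the function name and return values are illustrative, not the project's actual code:

```python
import os

def select_vector_backend(env=None) -> str:
    """Pick Pinecone when its required env vars are set, else local ChromaDB."""
    if env is None:
        env = os.environ
    required = ("ALIVEAI_PINECONE_API_KEY", "ALIVEAI_PINECONE_INDEX_NAME")
    if all(env.get(var) for var in required):
        return "pinecone"
    return "chroma"

assert select_vector_backend({}) == "chroma"
assert select_vector_backend({
    "ALIVEAI_PINECONE_API_KEY": "k",
    "ALIVEAI_PINECONE_INDEX_NAME": "idx",
}) == "pinecone"
```

Note that `ALIVEAI_PINECONE_INDEX_HOST` and `ALIVEAI_PINECONE_NAMESPACE` are optional, so they do not participate in the presence check here.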
Pinecone index requirements for this project:

- `dimension=768`
- `metric=cosine`

After enabling Pinecone, run:

```bash
python3.12 scripts/ingest.py --force
python3.12 -c "from test_rag import run_all_tests; run_all_tests()"
python3.12 -m uvicorn app.main:app --reload --port 8000
```

## Endpoints

- `POST /chat`
- `GET /health`
- `POST /reset/{session_id}`
- `GET /validate?text1=chest+hurts&text2=chest+pain`
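For a quick smoke test against a running server, a stdlib client like the one below works. The exact request schema for `POST /chat` is not documented here, so the `session_id` and `message` field names are assumptions; check `app/main.py` for the real model:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # matches the uvicorn --port 8000 above

def build_chat_request(session_id: str, message: str) -> urllib.request.Request:
    # Field names are assumed, not taken from this README.
    body = json.dumps({"session_id": session_id, "message": message}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("demo", "What are symptoms of anemia?")
# Uncomment once the server is up:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```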
## Notes

- All models are loaded lazily and instantiated as singletons.
- Emergency intent is short-circuited before retrieval.
- ChromaDB persists to the local `chroma_db/` directory.
- If Python cannot download Hugging Face assets in your environment, `scripts/download_models.py` mirrors the two SentenceTransformer model repos into `models/`.
- `HealthAgent` tries Ollama first, then Hugging Face local generation, then extractive context fallback (`ALIVEAI_CHAT_FALLBACK=auto`, default).
- If your default `python3` is a different interpreter (for example Python 3.13 via conda), use `python3.12` so installed dependencies match this project.
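The lazy-singleton loading noted above can be sketched with `functools.lru_cache`. This illustrates the pattern only, not the project's actual code; `FakeModel` stands in for an expensive model load such as a SentenceTransformer:

```python
from functools import lru_cache

class FakeModel:
    """Stand-in for an expensive model load."""
    loads = 0

    def __init__(self) -> None:
        FakeModel.loads += 1

@lru_cache(maxsize=1)
def get_model() -> FakeModel:
    # First call constructs the model; every later call returns the same instance.
    return FakeModel()

a, b = get_model(), get_model()
assert a is b
assert FakeModel.loads == 1
```

Nothing is constructed until the first `get_model()` call, so importing the module stays cheap and repeated calls share one instance.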