---
title: Sacred Texts RAG
emoji: ποΈ
colorFrom: yellow
colorTo: gray
sdk: docker
app_port: 7860
pinned: false
---
# Sacred Texts RAG: Multi-Religion Knowledge Base
A Retrieval-Augmented Generation (RAG) application that answers spiritual queries using the Bhagavad Gita, Quran, Bible, and Guru Granth Sahib as the sole knowledge sources. Now with **multi-turn conversation memory**: ask follow-up questions naturally, as in a real dialogue.
---
## Project Structure
```
sacred-texts-rag/
├── README.md
├── requirements.txt
├── .env.example
├── ingest.py          # Step 1: Load PDFs → chunk → embed → store
├── rag_chain.py       # Core RAG chain logic (with session memory)
├── app.py             # FastAPI backend server
└── frontend/
    └── index.html     # Chat UI (served by FastAPI)
```
---
## ⚙️ Setup Instructions
### 1. Install Dependencies
```bash
pip install -r requirements.txt
```
### 2. Configure Environment
```bash
cp .env.example .env
# Edit .env and add your NVIDIA_API_KEY
```
### 3. Add Your PDF Books
Place your PDF files in a `books/` folder:
```
books/
├── bhagavad_gita.pdf
├── quran.pdf
├── bible.pdf
└── guru_granth_sahib.pdf
```
### 4. Ingest the Books (Run Once)
```bash
python ingest.py
```
This will:
- Load and parse all PDFs
- Split into semantic chunks
- Create embeddings using NVIDIA's `llama-nemotron-embed-vl-1b-v2` model
- Store in a local ChromaDB vector store (`./chroma_db/`)
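
The chunking step above can be pictured with a minimal sketch. `ingest.py` is not reproduced in this README, so this is not its actual implementation: it swaps in plain fixed-size word windows with overlap instead of semantic splitting, and `chunk_words`, `size`, and `overlap` are hypothetical names chosen for illustration.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping windows of `size` words.

    Illustrative only: a simple stand-in for semantic chunking.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap keeps a verse that straddles a window boundary retrievable from at least one chunk.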
### 5. Start the Backend
```bash
python app.py
```
Server runs at: `http://localhost:7860`
### 6. Open the Frontend
Navigate to `http://localhost:7860` in your browser; the FastAPI server serves the UI directly.
---
## Environment Variables
| Variable | Description | Default |
|---|---|---|
| `NVIDIA_API_KEY` | Your NVIDIA API key | (none) |
| `CHROMA_DB_PATH` | Path to ChromaDB storage | `./chroma_db` |
| `COLLECTION_NAME` | ChromaDB collection name | `sacred_texts` |
| `CHUNKS_PER_BOOK` | Chunks retrieved per book per query | `3` |
| `MAX_HISTORY_TURNS` | Max conversation turns kept in memory per session | `6` |
| `HOST` | Server bind host | `0.0.0.0` |
| `PORT` | Server port | `7860` |
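
As a rough sketch of how the variables in the table could be read with their documented defaults, assuming nothing about how `app.py` actually loads them (`Settings` and `load_settings` are hypothetical names):

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    nvidia_api_key: str
    chroma_db_path: str
    collection_name: str
    chunks_per_book: int
    max_history_turns: int
    host: str
    port: int


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (os.environ by default), applying the table's defaults."""
    env = os.environ if env is None else env
    return Settings(
        nvidia_api_key=env.get("NVIDIA_API_KEY", ""),  # no default; must be set by the user
        chroma_db_path=env.get("CHROMA_DB_PATH", "./chroma_db"),
        collection_name=env.get("COLLECTION_NAME", "sacred_texts"),
        chunks_per_book=int(env.get("CHUNKS_PER_BOOK", "3")),
        max_history_turns=int(env.get("MAX_HISTORY_TURNS", "6")),
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "7860")),
    )
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check in isolation.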
---
## 🧠 How It Works
```
User Query
    │
    ▼
[Session Memory] ◄── Injects prior conversation turns into LLM context
    │
    ▼
[Query Augmentation] ◄── Short follow-ups are enriched with previous question
    │
    ▼
[Hybrid Retrieval: BM25 + Vector Search] ◄── Per-book guaranteed slots
    │
    ▼
[NVIDIA Reranker] ◄── llama-3.2-nv-rerankqa-1b-v2 re-scores pooled candidates
    │
    ▼
[Semantic Cache Check] ◄── Skip LLM if a similar question was answered before
    │
    ▼
[Prompt with Context + History]
    │
    ▼
[Llama-3.3-70b-instruct] ◄── Answer grounded ONLY in retrieved texts
    │
    ▼
Streamed response with source citations (book + chapter/verse)
```
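
The "per-book guaranteed slots" idea in the retrieval step can be sketched as follows. This README does not say how BM25 and vector scores are combined, so the sketch assumes each candidate already carries a single combined score; `select_per_book` and the tuple layout are hypothetical.

```python
from collections import defaultdict


def select_per_book(candidates, chunks_per_book=3):
    """Keep the top `chunks_per_book` chunks for every book.

    candidates: iterable of (book, chunk, score); a higher score is better.
    Assumption: scores are already fused from BM25 and vector search.
    """
    by_book = defaultdict(list)
    for book, chunk, score in candidates:
        by_book[book].append((score, chunk))

    selected = {}
    for book, scored in by_book.items():
        scored.sort(key=lambda pair: pair[0], reverse=True)
        selected[book] = [chunk for _, chunk in scored[:chunks_per_book]]
    return selected
```

Selecting per book rather than from one global ranking means a book with many high-scoring passages cannot crowd the others out of the prompt.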
---
## 💬 Multi-Turn Conversation
The app maintains per-session conversation history so you can ask natural follow-up questions:
```
You: "What do the scriptures say about forgiveness?"
AI: [Answer citing Gita, Quran, Bible, Guru Granth Sahib]
You: "Elaborate on the second point" ← follow-up, no context needed
AI: [Continues from previous answer]
You: "What does the Bible say specifically?" ← drill-down
AI: [Focuses on Bible passages from the thread]
```
**How sessions work:**
- A session ID is created automatically on your first question and persisted in the browser's `localStorage`
- The server keeps the last `MAX_HISTORY_TURNS` (default: 6) human+AI pairs in memory
- Click **↺ New Conversation** in the header to clear history and start fresh
- Sessions are scoped to the server process, so they reset on server restart
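
A minimal sketch of the in-memory, bounded history described above, assuming one store per server process. `SessionStore` and its methods are hypothetical names, not the classes used in `rag_chain.py`.

```python
from collections import deque

MAX_HISTORY_TURNS = 6  # mirrors the documented default


class SessionStore:
    """Keeps the last N (question, answer) pairs per session, in process memory only."""

    def __init__(self, max_turns=MAX_HISTORY_TURNS):
        self._max_turns = max_turns
        self._sessions = {}

    def add_turn(self, session_id, question, answer):
        # deque(maxlen=...) silently drops the oldest pair once the limit is hit
        history = self._sessions.setdefault(session_id, deque(maxlen=self._max_turns))
        history.append((question, answer))

    def history(self, session_id):
        return list(self._sessions.get(session_id, ()))

    def clear(self, session_id):
        self._sessions.pop(session_id, None)
```

Because the dictionary lives in the process, restarting the server discards every session, which matches the behavior noted in the list.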
---
## API Endpoints
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/ask` | Ask a question; streams NDJSON response |
| `POST` | `/clear` | Clear conversation history for a session |
| `GET` | `/history` | Inspect conversation history for a session |
| `GET` | `/books` | List all books indexed in the knowledge base |
| `GET` | `/health` | Health check |
| `GET` | `/` | Serves the frontend UI |
| `GET` | `/docs` | Swagger UI |
### `/ask` Request Body
```json
{
"question": "What do the scriptures say about compassion?",
"session_id": "optional-uuid-string"
}
```
### `/ask` Response (streamed NDJSON)
```json
{"type": "token", "data": "The Bhagavad Gita teaches..."}
{"type": "token", "data": " compassion as..."}
{"type": "sources", "data": [{"book": "Bhagavad Gita 2:47", "page": "2:47", "snippet": "..."}]}
```
Cache hits return a single `{"type": "cache", "data": {"answer": "...", "sources": [...]}}` line.
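
On the client side, the stream can be consumed one line at a time. The sketch below assumes only the three event shapes shown above (`token`, `sources`, `cache`); `collect_answer` is a hypothetical helper and takes any iterable of lines, such as the lines of an HTTP response body.

```python
import json


def collect_answer(lines):
    """Assemble (answer, sources) from NDJSON lines emitted by /ask."""
    answer_parts, sources = [], []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank keep-alive lines
        event = json.loads(raw)
        kind, data = event["type"], event["data"]
        if kind == "token":
            answer_parts.append(data)
        elif kind == "sources":
            sources = data
        elif kind == "cache":
            # a cache hit carries the whole answer in a single line
            return data["answer"], data["sources"]
    return "".join(answer_parts), sources
```

A UI would typically render each `token` as it arrives rather than waiting for the full answer; the helper joins them only to keep the example short.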
---
## Notes
- The LLM is instructed **never** to answer from outside the provided texts
- Each response includes **source citations** (book + chapter/verse where available)
- Responses synthesize wisdom **across all books** when relevant
- The semantic cache skips the LLM for repeated or near-identical questions (cosine distance < 0.35)
- Follow-up retrieval automatically augments vague short queries with the previous question for better semantic matching
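
The semantic cache rule (cosine distance < 0.35) can be illustrated with a small sketch. It assumes question embeddings are plain lists of floats and uses a linear scan; the real cache may be backed by the vector store instead, and `SemanticCache` is a hypothetical name.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 1.0
    return 1.0 - dot / norm


class SemanticCache:
    def __init__(self, threshold=0.35):
        self.threshold = threshold
        self._entries = []  # list of (embedding, answer)

    def put(self, embedding, answer):
        self._entries.append((embedding, answer))

    def get(self, embedding):
        """Return the closest cached answer within the threshold, else None."""
        best, best_dist = None, self.threshold
        for cached_embedding, answer in self._entries:
            dist = cosine_distance(embedding, cached_embedding)
            if dist < best_dist:
                best, best_dist = answer, dist
        return best
```

A lower threshold makes the cache stricter: fewer hits, but less risk of returning an answer to a question that only looks similar.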
---
## 🗺️ Planned Features
- Contextual chunk expansion (fetch ±1 surrounding chunks)
- HyDE: Hypothetical Document Embeddings for abstract queries
- Answer faithfulness scoring (LLM-as-judge)
- Query rewriting for vague inputs
- Snippet preview on source hover
- Query suggestions after each answer
- Compare mode: side-by-side view across books
- Hallucination guardrail
- Out-of-scope detection
- Rate limiting & API key hardening
---
## 🎬 Demo
App Link: https://shouvik99-lifeguide.hf.space/