---
title: Sacred Texts RAG
emoji: 🕊️
colorFrom: yellow
colorTo: gray
sdk: docker
app_port: 7860
pinned: false
---

# 🕊️ Sacred Texts RAG: Multi-Religion Knowledge Base

A Retrieval-Augmented Generation (RAG) application that answers spiritual queries using the Bhagavad Gita, Quran, Bible, and Guru Granth Sahib as the sole knowledge sources. Now with **multi-turn conversation memory**: ask follow-up questions naturally, just like a real dialogue.

---

## 📁 Project Structure

```
sacred-texts-rag/
├── README.md
├── requirements.txt
├── .env.example
├── ingest.py               # Step 1: Load PDFs → chunk → embed → store
├── rag_chain.py            # Core RAG chain logic (with session memory)
├── app.py                  # FastAPI backend server
└── frontend/
    └── index.html          # Chat UI (served by FastAPI)
```

---

## ⚙️ Setup Instructions

### 1. Install Dependencies
```bash
pip install -r requirements.txt
```

### 2. Configure Environment
```bash
cp .env.example .env
# Edit .env and add your NVIDIA_API_KEY
```

### 3. Add Your PDF Books
Place your PDF files in a `books/` folder:
```
books/
├── bhagavad_gita.pdf
├── quran.pdf
├── bible.pdf
└── guru_granth_sahib.pdf
```

### 4. Ingest the Books (Run Once)
```bash
python ingest.py
```
This will:
- Load and parse all PDFs
- Split into semantic chunks
- Create embeddings using NVIDIA's `llama-nemotron-embed-vl-1b-v2` model
- Store in a local ChromaDB vector store (`./chroma_db/`)
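
The chunking step can be pictured with a small sketch. The README does not say which splitter `ingest.py` uses, so the code below substitutes a plain fixed-size character window with overlap rather than a true semantic splitter; `chunk_text`, `chunk_size`, and `overlap` are hypothetical names and values chosen for illustration.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    Assumption: a stand-in for the project's real splitter, which may
    split on sentences or verses instead of raw character counts.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece.strip():  # skip windows that are only whitespace
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks
```

The overlap keeps a verse that straddles a window boundary retrievable from at least one chunk, at the cost of some duplicated text in the store.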

### 5. Start the Backend
```bash
python app.py
```
Server runs at: `http://localhost:7860`

### 6. Open the Frontend
Navigate to `http://localhost:7860` in your browser; the FastAPI server serves the UI directly.

---

## 🔑 Environment Variables

| Variable | Description | Default |
|---|---|---|
| `NVIDIA_API_KEY` | Your NVIDIA API key | (none) |
| `CHROMA_DB_PATH` | Path to ChromaDB storage | `./chroma_db` |
| `COLLECTION_NAME` | ChromaDB collection name | `sacred_texts` |
| `CHUNKS_PER_BOOK` | Chunks retrieved per book per query | `3` |
| `MAX_HISTORY_TURNS` | Max conversation turns kept in memory per session | `6` |
| `HOST` | Server bind host | `0.0.0.0` |
| `PORT` | Server port | `7860` |
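
As a rough illustration of how these variables and defaults might be read at startup, here is a minimal sketch. The `Settings` dataclass and `load_settings` function are hypothetical and not taken from `app.py`; only the variable names and default values come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables in the table above."""
    nvidia_api_key: str
    chroma_db_path: str
    collection_name: str
    chunks_per_book: int
    max_history_turns: int
    host: str
    port: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (os.environ by default), applying the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        nvidia_api_key=env.get("NVIDIA_API_KEY", ""),  # no default; empty means unset
        chroma_db_path=env.get("CHROMA_DB_PATH", "./chroma_db"),
        collection_name=env.get("COLLECTION_NAME", "sacred_texts"),
        chunks_per_book=int(env.get("CHUNKS_PER_BOOK", "3")),
        max_history_turns=int(env.get("MAX_HISTORY_TURNS", "6")),
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "7860")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to check in isolation, for example `load_settings({"PORT": "8000"}).port`.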

---

## 🧠 How It Works

```
User Query
    │
    ▼
[Session Memory]  ←── Injects prior conversation turns into LLM context
    │
    ▼
[Query Augmentation]  ←── Short follow-ups are enriched with previous question
    │
    ▼
[Hybrid Retrieval: BM25 + Vector Search]  ←── Per-book guaranteed slots
    │
    ▼
[NVIDIA Reranker]  ←── llama-3.2-nv-rerankqa-1b-v2 re-scores pooled candidates
    │
    ▼
[Semantic Cache Check]  ←── Skip LLM if a similar question was answered before
    │
    ▼
[Prompt with Context + History]
    │
    ▼
[Llama-3.3-70b-instruct]  ←── Answer grounded ONLY in retrieved texts
    │
    ▼
Streamed response with source citations (book + chapter/verse)
```
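
One possible reading of "per-book guaranteed slots" is that each book contributes up to `CHUNKS_PER_BOOK` of its own best candidates, so a single book cannot crowd out the others. The sketch below illustrates that idea only; the candidate shape (dictionaries with `book`, `score`, and `text` keys) and the function name `select_per_book` are assumptions, not the structures used in `rag_chain.py`.

```python
from collections import defaultdict


def select_per_book(candidates, chunks_per_book=3):
    """Keep the top-scoring candidates for each book separately.

    Assumption: each candidate is a dict with "book", "score", and "text"
    keys, where a higher score means a better match.
    """
    by_book = defaultdict(list)
    for candidate in candidates:
        by_book[candidate["book"]].append(candidate)

    selected = []
    for book in sorted(by_book):  # stable, alphabetical book order
        ranked = sorted(by_book[book], key=lambda c: c["score"], reverse=True)
        selected.extend(ranked[:chunks_per_book])  # at most N slots per book
    return selected
```

With a global top-k, a query that matches one book strongly could return passages from that book alone; reserving slots per book keeps every scripture represented in the pool passed to the reranker.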

---

## 💬 Multi-Turn Conversation

The app maintains per-session conversation history so you can ask natural follow-up questions:

```
You:  "What do the scriptures say about forgiveness?"
AI:   [Answer citing Gita, Quran, Bible, Guru Granth Sahib]

You:  "Elaborate on the second point"       ← follow-up, no context needed
AI:   [Continues from previous answer]

You:  "What does the Bible say specifically?"  ← drill-down
AI:   [Focuses on Bible passages from the thread]
```

**How sessions work:**
- A session ID is created automatically on your first question and persisted in the browser's `localStorage`
- The server keeps the last `MAX_HISTORY_TURNS` (default: 6) human+AI pairs in memory
- Click **↺ New Conversation** in the header to clear history and start fresh
- Sessions are scoped to the server process, so they reset on server restart
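
The session behavior described above can be sketched as an in-memory store with a bounded history per session. This is an illustration under assumptions, not the code in `rag_chain.py`: the `SessionStore` class and its method names are hypothetical, and only the default of 6 turns comes from `MAX_HISTORY_TURNS`.

```python
from collections import deque

MAX_HISTORY_TURNS = 6  # default taken from the environment variable table


class SessionStore:
    """Hypothetical in-process store of (question, answer) pairs per session."""

    def __init__(self, max_turns=MAX_HISTORY_TURNS):
        self._max_turns = max_turns
        self._sessions = {}  # session_id -> deque of (question, answer)

    def append(self, session_id, question, answer):
        # deque(maxlen=...) silently drops the oldest pair once the limit is hit
        history = self._sessions.setdefault(
            session_id, deque(maxlen=self._max_turns)
        )
        history.append((question, answer))

    def history(self, session_id):
        return list(self._sessions.get(session_id, ()))

    def clear(self, session_id):
        self._sessions.pop(session_id, None)
```

Because the dictionary lives in the server process, restarting the server discards every session, which matches the last point in the list above.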

---

## 🌐 API Endpoints

| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/ask` | Ask a question; streams NDJSON response |
| `POST` | `/clear` | Clear conversation history for a session |
| `GET` | `/history` | Inspect conversation history for a session |
| `GET` | `/books` | List all books indexed in the knowledge base |
| `GET` | `/health` | Health check |
| `GET` | `/` | Serves the frontend UI |
| `GET` | `/docs` | Swagger UI |

### `/ask` Request Body
```json
{
  "question": "What do the scriptures say about compassion?",
  "session_id": "optional-uuid-string"
}
```

### `/ask` Response (streamed NDJSON)
```json
{"type": "token",   "data": "The Bhagavad Gita teaches..."}
{"type": "token",   "data": " compassion as..."}
{"type": "sources", "data": [{"book": "Bhagavad Gita 2:47", "page": "2:47", "snippet": "..."}]}
```
Cache hits return a single `{"type": "cache", "data": {"answer": "...", "sources": [...]}}` line.
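
A client could assemble the stream along these lines. This is a minimal sketch based only on the three event shapes shown above (`token`, `sources`, `cache`); the function name `collect_stream` is hypothetical, and it accepts any iterable of text or byte lines so it does not depend on a particular HTTP library.

```python
import json


def collect_stream(lines):
    """Fold NDJSON events from /ask into (answer_text, sources)."""
    answer_parts = []
    sources = []
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        raw = raw.strip()
        if not raw:
            continue  # ignore blank keep-alive lines
        event = json.loads(raw)
        kind = event.get("type")
        if kind == "token":
            answer_parts.append(event["data"])
        elif kind == "sources":
            sources = event["data"]
        elif kind == "cache":
            # A cache hit carries the whole answer and its sources in one line
            answer_parts = [event["data"]["answer"]]
            sources = event["data"]["sources"]
    return "".join(answer_parts), sources
```

With the `requests` library, for instance, the lines from `requests.post(url, json=body, stream=True).iter_lines()` could be passed straight in.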

---

## 📝 Notes

- The LLM is instructed **never** to answer from outside the provided texts
- Each response includes **source citations** (book + chapter/verse where available)
- Responses synthesize wisdom **across all books** when relevant
- The semantic cache skips the LLM for repeated or near-identical questions (cosine distance < 0.35)
- Follow-up retrieval automatically augments vague short queries with the previous question for better semantic matching
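
The semantic cache check can be illustrated with a small sketch. It assumes question embeddings are plain lists of floats and uses a linear scan; the `SemanticCache` class and its methods are hypothetical, and only the 0.35 cosine-distance threshold comes from the note above.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical cache keyed by question embedding."""

    def __init__(self, threshold=0.35):
        self.threshold = threshold
        self._entries = []  # list of (embedding, answer)

    def store(self, embedding, answer):
        self._entries.append((list(embedding), answer))

    def lookup(self, embedding):
        """Return the closest cached answer under the threshold, else None."""
        best_answer = None
        best_distance = self.threshold
        for cached_embedding, answer in self._entries:
            distance = cosine_distance(embedding, cached_embedding)
            if distance < best_distance:
                best_answer, best_distance = answer, distance
        return best_answer
```

A lower threshold makes cache hits rarer but safer; a higher one saves more LLM calls at the risk of returning an answer to a question that only looks similar.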

---

## 🗺️ Planned Features

- Contextual chunk expansion (fetch ±1 surrounding chunks)
- HyDE (Hypothetical Document Embeddings) for abstract queries
- Answer faithfulness scoring (LLM-as-judge)
- Query rewriting for vague inputs
- Snippet preview on source hover
- Query suggestions after each answer
- Compare mode: side-by-side view across books
- Hallucination guardrail
- Out-of-scope detection
- Rate limiting & API key hardening

---

## 🎬 Demo

App Link: https://shouvik99-lifeguide.hf.space/