Deploy: auto-ingest, hybrid RAG, streaming, UMAP viz, 16 LLMs
- .env.example +3 -0
- .gitignore +9 -0
- MODEL_COMPARISON.md +121 -0
- README.md +122 -5
- app.py +522 -0
- config.py +89 -0
- ingest.py +157 -0
- rag_chain.py +353 -0
- requirements.txt +17 -0
.env.example
ADDED
@@ -0,0 +1,3 @@
+GOOGLE_API_KEY=your_google_ai_studio_key_here
+GROQ_API_KEY=your_groq_key_here
+OPENROUTER_API_KEY=your_openrouter_key_here
.gitignore
ADDED
@@ -0,0 +1,9 @@
+.env
+__pycache__/
+*.pyc
+*.pyo
+.DS_Store
+data/
+vectorstore/
+*.log
+.pytest_cache/
MODEL_COMPARISON.md
ADDED
@@ -0,0 +1,121 @@
+# LLM Model Comparison
+
+Benchmarked on: *"What is Nietzsche's view on nihilism and the will to power?"*
+Setup: RTX 3060, EmbeddingGemma-300M on CUDA, ChromaDB (~5,700 chunks), `RETRIEVAL_K=5`.
+
+Rate limits for Google verified directly from **aistudio.google.com/rate-limit** (May 2026).
+Rate limits for Groq verified from **live API response headers** + console.groq.com/docs/rate-limits.
+Rate limits for OpenRouter verified from **openrouter.ai/docs/guides/routing/model-variants/free**.
+
+---
+
+## Full Comparison Table
+
+| Model | Provider | Latency | RPM | TPM | RPD | Notes |
+|---|---|---|---|---|---|---|
+| **Gemma 4 MoE 26B** | Google | ~65 s | 15 | ∞ | **1,500** | Best limits of any Google model; slow but deep |
+| **Gemma 4 Dense 31B** | Google | ~25 s | 15 | ∞ | **1,500** | Same limits as MoE; faster, slightly less depth |
+| **Gemini 3.1 Flash Lite** | Google | ~0.6 s | 15 | 250K | **500** | Newest Gemini, highest RPD among Flash models |
+| **Gemini 3.5 Flash** | Google | ~0.8 s | 5 | 250K | 20 | Latest Gemini series; crisp reasoning |
+| **Gemini 3 Flash** | Google | ~9 s | 5 | 250K | 20 | Solid baseline Gemini 3 |
+| **Gemini 2.5 Flash** | Google | ~7 s | 5 | 250K | 20 | Previous generation; well-rounded |
+| **Gemini 2.5 Flash Lite** | Google | ~2 s | 10 | 250K | 20 | Fastest 2.5; same 20 RPD as 2.5 Flash |
+| **Llama 3.1 8B** | Groq | ~2 s | **14,400** | 6K | **14,400** | Highest throughput by far; limited depth |
+| **Llama 4 Scout 17B** | Groq | ~1.5 s | 1,000 | 30K | 1,000 | Fastest quality model overall |
+| **Llama 3.3 70B** | Groq | ~4.5 s | 1,000 | 12K | 1,000 | Best Groq quality; lower token quota |
+| **Qwen3 32B** | Groq | ~5 s | 1,000 | 6K | 1,000 | Chain-of-thought; deepest Groq reasoning |
+| **Nvidia Nemotron 120B** | OpenRouter | ~75 s | 20 | — | 50* | Exceptional philosophical depth; slow |
+| **OpenAI OSS 120B** | OpenRouter | ~22 s | 20 | — | 50* | Strong quality; best free OR option |
+| **DeepSeek V4 Flash** | OpenRouter | ~5 s† | 20 | — | 50* | 1M context window; fast when available |
+| **Llama 3.3 70B** | OpenRouter | ~4 s† | 20 | — | 50* | Same weights as Groq; use Groq instead |
+| **Qwen3 Next 80B** | OpenRouter | ~8 s† | 20 | — | 50* | Strong reasoning; frequently throttled |
+| **Gemma 4 MoE 26B** | OpenRouter | ~5 s† | 20 | — | 50* | Same weights as Google version |
+
+*50 RPD without account credits; 1,000 RPD with $10+ credit purchase
+†Latency when not throttled; free-tier provider-side 429s are common during peak hours
+— OpenRouter does not enforce token-based limits on free models
+
+---
+
+## Rate Limit Deep-Dive
+
+### Google AI Studio — verified from aistudio.google.com/rate-limit
+
+| Model | API Model ID | RPM | TPM | RPD |
+|---|---|---|---|---|
+| Gemma 4 MoE 26B | `gemma-4-26b-a4b-it` | 15 | **Unlimited** | **1,500** |
+| Gemma 4 Dense 31B | `gemma-4-31b-it` | 15 | **Unlimited** | **1,500** |
+| Gemini 3.1 Flash Lite | `gemini-3.1-flash-lite` | 15 | 250,000 | **500** |
+| Gemini 3.5 Flash | `gemini-3.5-flash` | 5 | 250,000 | 20 |
+| Gemini 3 Flash | `gemini-3-flash-preview` | 5 | 250,000 | 20 |
+| Gemini 2.5 Flash | `gemini-2.5-flash` | 5 | 250,000 | 20 |
+| Gemini 2.5 Flash Lite | `gemini-2.5-flash-lite` | 10 | 250,000 | 20 |
+| ~~Gemini 2.5 Pro~~ | ~~`gemini-2.5-pro`~~ | 0 | 0 | 0 |
+| ~~Gemini 2.0 Flash~~ | ~~`gemini-2.0-flash`~~ | 0 | 0 | 0 |
+
+> **Key insight:** Gemma 4 models have *significantly better* limits than Gemini models — unlimited TPM and 1,500 RPD vs just 20 RPD for most Gemini Flash variants. Gemini 2.5 Pro and 2.0 Flash are completely locked (0/0/0) on this account's free tier.
+
+---
+
+### Groq — verified from live API headers + docs
+
+| Model | API Model ID | RPM | TPM | RPD | TPD |
+|---|---|---|---|---|---|
+| Llama 3.1 8B instant | `llama-3.1-8b-instant` | **14,400** | 6,000 | **14,400** | 500,000 |
+| Llama 3.3 70B versatile | `llama-3.3-70b-versatile` | 1,000 | 12,000 | 1,000 | 100,000 |
+| Llama 4 Scout 17B | `meta-llama/llama-4-scout-17b-16e-instruct` | 1,000 | 30,000 | 1,000 | 500,000 |
+| Qwen3 32B | `qwen/qwen3-32b` | 1,000 | 6,000 | 1,000 | 500,000 |
+
+> **Key insight:** Groq is the most generous free tier for RAG use. Llama 3.1 8B has 14,400 RPD — useful for high-volume scenarios. Note that TPM limits (6K–30K) can be a bottleneck when RAG context is large; Llama 4 Scout has the most generous TPM at 30K.
|
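As a rough illustration of the TPM bottleneck described for Groq, the sketch below estimates how many RAG requests per minute fit under each limit listed in the Groq table. The per-request token count is an assumption (5 retrieved chunks at roughly 400 tokens each, plus prompt and answer), not a measured value, and `max_requests_per_minute` is a hypothetical helper that is not part of this project.

```python
def max_requests_per_minute(rpm_limit: int, tpm_limit: int, tokens_per_request: int) -> int:
    """Return the request rate allowed once both the RPM and TPM caps apply."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return min(rpm_limit, tpm_limit // tokens_per_request)


# Assumed request size: 5 chunks x ~400 tokens, ~200 prompt tokens, ~500 answer tokens.
TOKENS_PER_REQUEST = 5 * 400 + 200 + 500

# (RPM, TPM) pairs copied from the Groq rate-limit table.
GROQ_LIMITS = {
    "llama-3.1-8b-instant": (14_400, 6_000),
    "llama-3.3-70b-versatile": (1_000, 12_000),
    "meta-llama/llama-4-scout-17b-16e-instruct": (1_000, 30_000),
    "qwen/qwen3-32b": (1_000, 6_000),
}

for model_id, (rpm, tpm) in GROQ_LIMITS.items():
    print(model_id, max_requests_per_minute(rpm, tpm, TOKENS_PER_REQUEST))
```

Under these assumed sizes the TPM cap, not the RPM cap, is the binding limit for every Groq model in the table.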
+
+---
+
+### OpenRouter — all `:free` models share identical limits
+
+| Metric | Without credits | With $10+ credits |
+|---|---|---|
+| RPM | 20 | 20 |
+| RPD | **50** | 1,000 |
+| TPM / TPD | Unlimited | Unlimited |
+
+> **Key insight:** 50 RPD is exhausted extremely quickly — this explains the frequent 429 errors during testing. OpenRouter free tier is best for occasional access to very large models (120B+) not available elsewhere, not for regular daily use. Provider-side throttling from upstream (NVIDIA, DeepSeek, etc.) adds additional 429s beyond OpenRouter's own quota.
|
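Transient 429s from per-minute limits or upstream throttling are usually handled by retrying with exponential backoff. The following is a minimal sketch of that pattern, not the project's own error handling: `RateLimitError` is a hypothetical stand-in for whatever exception a given client library raises on HTTP 429, and `call_with_backoff` is an illustrative helper.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for the error a client raises on HTTP 429."""


def call_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimitError with exponential backoff plus jitter."""
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the 429 to the caller
            # Delay doubles each attempt; add up to 0.25 * base_delay of jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.25 * base_delay)
            sleep(delay)
```

Backoff only helps with short-lived throttling. Once a daily quota such as the 50 RPD cap is exhausted, retries keep failing until the quota resets, so switching provider is the more useful fallback in that case.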
+
+---
+
+## Provider Verdict
+
+| Provider | Best for | Main bottleneck |
+|---|---|---|
+| **Google (Gemma 4)** | Best free tier overall — high RPD + unlimited tokens | Slow inference (~25–65 s) |
+| **Google (Gemini 3.1 Flash Lite)** | Best speed + reasonable daily quota | 500 RPD, 250K TPM |
+| **Groq** | Fastest inference, high-volume use | TPM cap (6K–30K) limits long RAG contexts |
+| **OpenRouter** | Accessing 120B+ models for free | 50 RPD hard cap, frequent provider throttling |
+
+---
+
+## Recommendations
+
+| Use case | Best choice |
+|---|---|
+| Best overall (default) | **Gemma 4 MoE 26B [Google]** — best limits + quality |
+| Fastest response | **Llama 4 Scout 17B [Groq]** — ~1.5 s |
+| Fastest + high daily quota | **Gemini 3.1 Flash Lite [Google]** — 500 RPD, ~0.6 s |
+| Deepest philosophical reasoning | **Qwen3 32B [Groq]** or **Llama 3.3 70B [Groq]** |
+| Maximum context window | **DeepSeek V4 Flash [OR]** — 1M tokens |
+| Highest model quality | **Nvidia Nemotron 120B [OR]** or **OpenAI OSS 120B [OR]** |
+| High-volume / many requests/day | **Llama 3.1 8B [Groq]** — 14,400 RPD |
+
+---
+
+## Running the benchmark
+
+```bash
+python test_models.py
+```
+
+Requires `.env` with at least one key:
+```
+GOOGLE_API_KEY=... # aistudio.google.com
+GROQ_API_KEY=... # console.groq.com
+OPENROUTER_API_KEY=... # openrouter.ai
+```
|
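Because only one key is required, it can help to check which providers are usable before running anything. The sketch below shows one way to do that using the three variable names from `.env.example`; `available_providers` is a hypothetical helper, and it assumes the `.env` values have already been loaded into the process environment.

```python
import os
from typing import Mapping, Optional

# Environment variable names taken from .env.example.
PROVIDER_KEYS = {
    "Google": "GOOGLE_API_KEY",
    "Groq": "GROQ_API_KEY",
    "OpenRouter": "OPENROUTER_API_KEY",
}


def available_providers(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the providers whose API key is set to a non-blank value."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var, "").strip()]


if __name__ == "__main__":
    found = available_providers()
    print("Usable providers:", ", ".join(found) if found else "none")
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to exercise without touching real credentials.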
README.md
CHANGED
@@ -1,13 +1,130 @@
 ---
 title: Philosopher Chat
-emoji:
-colorFrom:
-colorTo:
+emoji: 🏛️
+colorFrom: purple
+colorTo: indigo
 sdk: gradio
 sdk_version: 6.15.1
-python_version: '3.13'
 app_file: app.py
 pinned: false
+license: mit
 ---
 
-
+# Philosopher Chat
+
+A RAG (Retrieval-Augmented Generation) chatbot grounded in Western philosophical primary texts.
+Ask questions about nihilism, existentialism, epistemology, ethics, and more — answers are
+cited directly from 12 primary texts (~5,700 chunks).
+
+**Live demo:** [fikri0o0/philosopher-chat on HuggingFace Spaces](https://huggingface.co/spaces/fikri0o0/philosopher-chat)
+
+---
+
+## Features
+
+| Feature | Detail |
+|---|---|
+| **Hybrid RAG** | BM25 + semantic cosine similarity ensemble |
+| **Streaming** | Token-by-token via Google / Groq / OpenRouter |
+| **16 LLMs** | Gemma 4, Gemini, Llama 4, Qwen3, DeepSeek, Nemotron — all free tier |
+| **Think blocks** | Qwen3 / DeepSeek reasoning rendered as collapsible chains-of-thought |
+| **UMAP viz** | 2D projection of all 5,700+ embeddings coloured by philosopher |
+| **Model comparison** | Side-by-side latency + quality comparison across any two models |
+| **Extendable KB** | Upload your own PDF/TXT to add new philosophers |
|
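The hybrid retrieval feature can be illustrated with a small rank-fusion sketch. How `rag_chain.py` actually combines BM25 and semantic results is not shown in this part of the diff, so the weighted reciprocal rank fusion below is an assumption, chosen because it is a common way to merge two ranked lists. The function name, the chunk IDs, and the `k=60` constant are all illustrative.

```python
from collections import defaultdict
from typing import Optional, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]],
    weights: Optional[Sequence[float]] = None,
    k: int = 60,
) -> list[str]:
    """Merge ranked ID lists; each list adds weight / (k + rank) to an ID's score."""
    if weights is None:
        weights = [1.0] * len(rankings)
    if len(weights) != len(rankings):
        raise ValueError("need exactly one weight per ranking")
    scores: dict[str, float] = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["c1", "c2", "c3"]      # keyword ranking (made-up chunk IDs)
semantic_hits = ["c3", "c1", "c4"]  # embedding-similarity ranking
print(reciprocal_rank_fusion([bm25_hits, semantic_hits]))
```

Rank-based fusion avoids having to put BM25 scores and cosine similarities on a common scale, which is the main reason it is often preferred over adding raw scores.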
+
+---
+
+## Knowledge Base
+
+| Philosopher | Works |
+|---|---|
+| Nietzsche | *Thus Spoke Zarathustra*, *Beyond Good and Evil*, *On the Genealogy of Morality*, *The Birth of Tragedy* |
+| Schopenhauer | *Essays of Arthur Schopenhauer* |
+| Hume | *An Enquiry Concerning Human Understanding* |
+| Russell | *The Problems of Philosophy* |
+| Marcus Aurelius | *Meditations* |
+| Plato | *The Republic* |
+| Mill | *Utilitarianism* |
+| Epictetus | *The Enchiridion* |
+| Kant | *Fundamental Principles of the Metaphysic of Morals* |
+
+All texts are public domain, sourced from [Project Gutenberg](https://www.gutenberg.org).
+
+---
+
+## Tech Stack
+
+| Layer | Tool |
+|---|---|
+| LLM routing | 16 models via Google AI Studio, Groq, OpenRouter (all free tier) |
+| Embeddings | `google/embeddinggemma-300m` (HuggingFace, 768-dim) |
+| Retrieval | Hybrid BM25 + ChromaDB semantic search |
+| RAG Framework | LangChain LCEL (no chains, direct composition) |
+| UI | Gradio 6 |
+| Deployment | HuggingFace Spaces |
+
+---
+
+## Local Setup
+
+### 1. Clone and install
+
+```bash
+git clone https://github.com/Fikri645/philosopher-chat
+cd philosopher-chat
+pip install -r requirements.txt
+```
+
+### 2. Set up API keys
+
+```bash
+# Create .env with your keys:
+GOOGLE_API_KEY=... # https://ai.google.dev (free)
+GROQ_API_KEY=... # https://console.groq.com (free)
+OPENROUTER_API_KEY=... # https://openrouter.ai (free)
+HF_TOKEN=... # https://huggingface.co/settings/tokens (for gated EmbeddingGemma)
+```
+
+### 3. Build the vectorstore (run once)
+
+```bash
+python ingest.py
+```
+
+Downloads 12 texts from Project Gutenberg, chunks them, embeds with EmbeddingGemma-300M,
+and persists to `vectorstore/`. Takes ~5–10 min on first run (model download + embedding).
+
+### 4. Run the app
+
+```bash
+python app.py
+```
+
+Open http://localhost:7860 in your browser.
+
+---
+
+## Deploying to HuggingFace Spaces
+
+1. Fork or push to a new Space (SDK: **Gradio**)
+2. In **Space Settings → Variables and Secrets**, add:
+   - `GOOGLE_API_KEY`
+   - `GROQ_API_KEY`
+   - `OPENROUTER_API_KEY`
+   - `HF_TOKEN` (your HF token — needed to download the gated EmbeddingGemma model)
+3. On first boot the Space auto-ingests all 12 texts (~10 min); subsequent boots load the cached vectorstore.
|
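The first-boot behaviour in step 3 amounts to a build-if-missing check. The sketch below shows that pattern in isolation. It is not the project's implementation (the app imports `vectorstore_exists` from `rag_chain.py`, whose body is not shown here); `ensure_vectorstore` and the `build` callback are hypothetical names.

```python
from pathlib import Path
from typing import Callable


def ensure_vectorstore(path: Path, build: Callable[[Path], None]) -> bool:
    """Build the index only when `path` is missing or empty.

    Returns True if a build was triggered, False if a cached index was found.
    """
    if path.is_dir() and any(path.iterdir()):
        return False  # cached index present: skip the slow ingest step
    path.mkdir(parents=True, exist_ok=True)
    build(path)  # e.g. download texts, chunk, embed, persist
    return True
```

A directory check like this treats any non-empty folder as a valid index, so a partially written index from an interrupted ingest would be accepted. Writing a marker file at the end of a successful build is one way to guard against that.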
+
+---
+
+## Project Structure
+
+```
+philosopher-chat/
+├── app.py ← Gradio UI + event handlers
+├── rag_chain.py ← LangChain RAG pipeline (retrieval + LLM routing)
+├── ingest.py ← Data ingestion from Project Gutenberg
+├── config.py ← LLM options, embedding model, RAG parameters
+├── requirements.txt
+├── .gitignore
+└── README.md
+```
app.py
ADDED
@@ -0,0 +1,522 @@
+import re
+import time
+
+import gradio as gr
+import plotly.express as px
+import pandas as pd
+
+from rag_chain import (
+    retrieve_docs, stream_llm, query, add_to_kb, vectorstore_exists,
+    get_all_philosophers, get_kb_stats, get_umap_data,
+)
+from config import LLM_OPTIONS, DEFAULT_LLM, EMBEDDING_OPTIONS, DEFAULT_EMBEDDING
+
+# ---------------------------------------------------------------------------
+# Display helpers
+# ---------------------------------------------------------------------------
+
+_PROVIDER_COLOR = {
+    "Google": "#4285F4",
+    "Groq": "#FF4B36",
+    "OpenRouter": "#6366F1",
+}
+
+_COMPARE_DEFAULT_B = "Llama 4 Scout 17B [Groq]"
+
+_THINK_STYLE = (
+    "color:var(--body-text-color-subdued);font-size:0.88em;"
+    "border-left:3px solid var(--border-color-primary);padding-left:12px;margin:6px 0"
+)
+_SUMMARY_STYLE = (
+    "cursor:pointer;color:var(--body-text-color-subdued);"
+    "font-style:italic;user-select:none"
+)
+
+
+def _format_think_blocks(text: str) -> str:
+    """Render <think>…</think> as collapsible, muted sections.
+
+    Mid-stream (</think> not yet seen): open <details> showing live reasoning.
+    Complete block: closed <details> with 'click to expand' label.
+    """
+    if "<think>" not in text:
+        return text
+
+    if "</think>" not in text:
+        # Partial — think block still streaming
+        idx = text.index("<think>")
+        pre, thinking = text[:idx], text[idx + 7:]
+        return (
+            pre
+            + f'<details open><summary style="{_SUMMARY_STYLE}">🤔 Thinking…</summary>'
+            + f'<div style="{_THINK_STYLE}">{thinking}</div></details>'
+        )
+
+    def _wrap(m: re.Match) -> str:
+        content = m.group(1).strip()
+        return (
+            f'<details><summary style="{_SUMMARY_STYLE}">'
+            "🤔 Chain of thought (click to expand)</summary>"
+            f'<div style="{_THINK_STYLE}">{content}</div></details>\n\n'
+        )
+
+    return re.sub(r"<think>(.*?)</think>", _wrap, text, flags=re.DOTALL)
+
+
+def _score_bar(score: float, width: int = 10) -> str:
+    filled = max(0, min(width, round(score * width)))
+    return "█" * filled + "░" * (width - filled)
+
+
+def _format_sources(docs: list, scores: list[float]) -> str:
+    if not docs:
+        return ""
+    seen: set = set()
+    lines: list[str] = []
+    for doc, score in zip(docs, scores):
+        key = doc.metadata.get("source", "Unknown source")
+        if key not in seen:
+            seen.add(key)
+            tag = f"`{score:.2f}` " if score >= 0 else "`BM25` "
+            lines.append(f"- {tag}{key}")
+    return "\n\n---\n**Sources:**\n" + "\n".join(lines)
+
+
+def _format_retrieved_chunks(docs: list, scores: list[float]) -> str:
+    if not docs:
+        return "_No chunks retrieved._"
+
+    semantic_scores = [s for s in scores if s >= 0]
+    avg = sum(semantic_scores) / len(semantic_scores) if semantic_scores else 0.0
+    has_bm25 = any(s < 0 for s in scores)
+    method = "Hybrid BM25 + Semantic" if has_bm25 else "Semantic"
+
+    lines = [
+        f"**{len(docs)} chunks** · {method}"
+        f" · avg similarity: **{avg:.3f}**\n"
+    ]
+    for i, (doc, score) in enumerate(zip(docs, scores), 1):
+        phil = doc.metadata.get("philosopher", "?")
+        title = doc.metadata.get("title", "?")
+        if score >= 0:
+            tag = f"`{score:.3f}` {_score_bar(score)}"
+        else:
+            tag = "`BM25 ` ──────────"
+        text = doc.page_content[:200].replace("\n", " ").strip()
+        lines.append(
+            f"**{i}.** {tag} *{phil}* · {title} \n"
+            f" *\"{text}...\"*\n"
+        )
+    return "\n".join(lines)
+
+
+def _format_metrics(
+    retrieve_s: float, llm_s: float, n_docs: int, n_sources: int
+) -> str:
+    return (
+        f"⏱ Retrieval **{retrieve_s:.2f}s** · "
+        f"LLM **{llm_s:.2f}s** · "
+        f"Total **{retrieve_s + llm_s:.2f}s** · "
+        f"**{n_docs}** chunks from **{n_sources}** source(s)"
+    )
+
+
+def _kb_markdown() -> str:
+    stats = get_kb_stats()
+    if not stats["total"]:
+        return "_Knowledge base is empty._"
+    lines = []
+    for phil in sorted(stats["sources"]):
+        lines.append(f"**{phil}**")
+        for title in sorted(stats["sources"][phil]):
+            lines.append(f" - *{title}*")
+    lines.append(f"\n_{stats['total']:,} total chunks_")
+    return "\n\n".join(lines)
+
+
+# ---------------------------------------------------------------------------
+# Event handlers
+# ---------------------------------------------------------------------------
+
+def respond_stream(message: str, history: list, philosopher: str, llm_label: str):
+    if not message.strip():
+        yield history, "", gr.update(), gr.update()
+        return
+
+    if not vectorstore_exists():
+        err = "Knowledge base not found. Run `python ingest.py` first."
+        yield history + [{"role": "assistant", "content": err}], "", gr.update(), gr.update()
+        return
+
+    # — Retrieval (fast, happens before streaming) —
+    t0 = time.perf_counter()
+    docs, scores = retrieve_docs(message, philosopher)
+    retrieve_time = time.perf_counter() - t0
+    context_str = "\n\n".join(d.page_content for d in docs)
+
+    chunks_md = _format_retrieved_chunks(docs, scores)
+
+    history = history + [
+        {"role": "user", "content": message},
+        {
+            "role": "assistant",
+            "content": (
+                "<em style='color:var(--body-text-color-subdued)'>"
+                "⏳ Retrieving context and generating response…"
+                "</em>"
+            ),
+        },
+    ]
+    # Show user bubble + loading message immediately
+    yield history, "", gr.update(value=chunks_md), gr.update()
+
+    provider, model_id = LLM_OPTIONS.get(llm_label, LLM_OPTIONS[DEFAULT_LLM])
+    t1 = time.perf_counter()
+    full_response = ""
+    try:
+        for text_chunk in stream_llm(provider, model_id, context_str, message):
+            full_response += text_chunk
+            history[-1]["content"] = _format_think_blocks(full_response)
+            yield history, "", gr.update(value=chunks_md), gr.update()
+
+        llm_time = time.perf_counter() - t1
+        unique_sources = len({d.metadata.get("source") for d in docs})
+        metrics_md = _format_metrics(retrieve_time, llm_time, len(docs), unique_sources)
+
+        history[-1]["content"] = (
+            _format_think_blocks(full_response) + _format_sources(docs, scores)
+        )
+        yield history, "", gr.update(value=chunks_md), gr.update(value=metrics_md)
+
+    except Exception as exc:
+        history[-1]["content"] = f"⚠️ **Error:** {exc}"
+        yield history, "", gr.update(value=chunks_md), gr.update()
+
+
+def compare_respond(message: str, philosopher: str, llm_a: str, llm_b: str):
+    if not message.strip():
+        return "Enter a question above.", "", "Enter a question above.", ""
+    if not vectorstore_exists():
+        msg = "Knowledge base not found."
+        return msg, "", msg, ""
+
+    def _run(llm_label: str) -> tuple[str, str]:
+        t0 = time.perf_counter()
+        result = query(message, philosopher, llm_label)
+        elapsed = time.perf_counter() - t0
+        n_src = len({d.metadata.get("source") for d in result["context"]})
+        sem_scores = [s for s in result["scores"] if s >= 0]
+        avg = sum(sem_scores) / len(sem_scores) if sem_scores else 0.0
+        metrics = (
+            f"⏱ **{elapsed:.2f}s** · "
+            f"**{len(result['context'])}** chunks from **{n_src}** source(s)"
+            f" · avg similarity **{avg:.3f}**"
+        )
+        return result["answer"], metrics
+
+    ans_a, met_a = _run(llm_a)
+    ans_b, met_b = _run(llm_b)
+    return ans_a, met_a, ans_b, met_b
+
+
+def upload_source(file, author: str, title: str):
+    if file is None:
+        return gr.update(value="Please upload a file first."), gr.update()
+    if not author.strip() or not title.strip():
+        return gr.update(value="Please fill in both Author and Title."), gr.update()
+    try:
+        n = add_to_kb(file, author.strip(), title.strip())
+        msg = f"Added {n:,} chunks from *{title}* by {author}."
+    except Exception as e:
+        msg = f"Error: {e}"
+    return (
+        gr.update(value=msg),
+        gr.update(choices=get_all_philosophers(), value="All"),
+    )
+
+
+def refresh_kb():
+    return gr.update(value=_kb_markdown())
+
+
+def build_umap_plot():
+    data = get_umap_data()
+    if data is None:
+        return None
+    df = pd.DataFrame(data)
+    fig = px.scatter(
+        df, x="x", y="y",
+        color="philosopher",
+        hover_data={"title": True, "preview": True, "x": False, "y": False},
+        title="Knowledge Base — Semantic Embedding Space (UMAP 2D)",
+        labels={"x": "UMAP-1", "y": "UMAP-2"},
+        opacity=0.75,
+        template="plotly_dark",
+        color_discrete_sequence=px.colors.qualitative.Bold,
+    )
+    fig.update_traces(marker=dict(size=5))
+    fig.update_layout(
+        height=540,
+        plot_bgcolor="rgba(0,0,0,0)",
+        paper_bgcolor="rgba(0,0,0,0)",
+        title_font=dict(size=14),
+        font=dict(color="rgba(220,220,220,0.9)"),
+        legend=dict(
+            title_text="",
+            yanchor="top", y=0.99, xanchor="left", x=0.01,
+            bgcolor="rgba(20,20,20,0.5)",
+            bordercolor="rgba(255,255,255,0.12)",
+            borderwidth=1,
+        ),
+        xaxis=dict(gridcolor="rgba(255,255,255,0.07)", zeroline=False),
+        yaxis=dict(gridcolor="rgba(255,255,255,0.07)", zeroline=False),
+        margin=dict(l=40, r=20, t=48, b=36),
+    )
+    return fig
+
+
| 278 |
+
# ---------------------------------------------------------------------------
|
| 279 |
+
# UI
|
| 280 |
+
# ---------------------------------------------------------------------------
|
| 281 |
+
|
| 282 |
+
EXAMPLE_QUESTIONS = [
|
| 283 |
+
"What is Nietzsche's view on nihilism and the death of God?",
|
| 284 |
+
"How does Schopenhauer view suffering and the will to live?",
|
| 285 |
+
"What does Hume say about causality and the limits of reason?",
|
| 286 |
+
"Can we have certain knowledge of the external world?",
|
| 287 |
+
"Is morality objective or invented?",
|
| 288 |
+
"Explain the concept of Eternal Return",
|
| 289 |
+
"How does Marcus Aurelius advise dealing with suffering?",
|
| 290 |
+
"What is Plato's ideal society in The Republic?",
|
| 291 |
+
"Compare Schopenhauer and Nietzsche on the will",
|
| 292 |
+
"What is Kant's categorical imperative?",
|
| 293 |
+
"How does Mill justify utilitarianism?",
|
| 294 |
+
"What does Epictetus say about what is in our control?",
|
| 295 |
+
]
|
| 296 |
+
|
| 297 |
+
CSS = """
|
| 298 |
+
footer { display: none !important; }
|
| 299 |
+
.section-label {
|
| 300 |
+
font-size: 0.78rem; font-weight: 700; letter-spacing: 0.07em;
|
| 301 |
+
text-transform: uppercase; color: var(--body-text-color-subdued);
|
| 302 |
+
margin-bottom: 2px;
|
| 303 |
+
}
|
| 304 |
+
.metric-bar p { font-size: 0.82rem; color: var(--body-text-color-subdued); margin: 4px 0; }
|
| 305 |
+
.status-box textarea { font-size: 0.82rem !important; }
|
| 306 |
+
|
| 307 |
+
/* Fix double scrollbar: prevent inner message wrappers from scrolling */
|
| 308 |
+
.chatbot .overflow-y-auto { scrollbar-width: thin; scrollbar-color: var(--border-color-primary) transparent; }
|
| 309 |
+
.chatbot .message-wrap { overflow: visible !important; }
|
| 310 |
+
.chatbot .message-wrap > div { overflow: visible !important; max-height: none !important; }
|
| 311 |
+
/* Prevent long markdown lines from adding a horizontal inner scroll */
|
| 312 |
+
.chatbot .prose { overflow-x: hidden !important; overflow-wrap: break-word; word-break: break-word; }
|
| 313 |
+
"""
|
| 314 |
+
|
with gr.Blocks(title="Philosopher Chat") as demo:

    gr.Markdown(
        """
        # 📚 Philosopher Chat
        **RAG chatbot grounded in Western philosophical primary texts**

        Hybrid BM25 + Semantic retrieval · Real-time streaming
        · Multi-provider LLM routing · 12 primary texts · ~5,700 chunks
        """
    )

    with gr.Tabs():

        # ── Tab 1 ─ Chat ─────────────────────────────────────────────────
        with gr.Tab("💬 Chat"):
            with gr.Row(equal_height=False):

                # Left: chat area
                with gr.Column(scale=3):
                    chatbot_ui = gr.Chatbot(
                        height=480,
                        show_label=False,
                        placeholder="*Ask a philosophical question to get started...*",
                    )
                    msg_input = gr.Textbox(
                        placeholder="Ask a philosophical question…",
                        show_label=False,
                        autofocus=True,
                        submit_btn=True,
                    )
                    metrics_display = gr.Markdown(
                        value="", elem_classes="metric-bar"
                    )
                    with gr.Accordion("📄 Retrieved Chunks & Scores", open=False):
                        retrieved_display = gr.Markdown(
                            value="_Submit a question to see retrieved context._"
                        )
                    with gr.Accordion("💡 Example Questions", open=False):
                        gr.Examples(
                            examples=[[q] for q in EXAMPLE_QUESTIONS],
                            inputs=[msg_input],
                            label=None,
                        )

                # Right: settings sidebar
                with gr.Column(scale=1, min_width=240):
                    with gr.Group():
                        gr.Markdown("**⚙️ Chat Settings**", elem_classes="section-label")
                        llm_dropdown = gr.Dropdown(
                            choices=list(LLM_OPTIONS.keys()),
                            value=DEFAULT_LLM,
                            label="LLM Model",
                        )
                        embedding_display = gr.Dropdown(
                            choices=list(EMBEDDING_OPTIONS.keys()),
                            value=DEFAULT_EMBEDDING,
                            label="Embedding Model",
                            info="Change requires rebuilding index (ingest.py)",
                            interactive=False,
                        )
                        philosopher_filter = gr.Dropdown(
                            choices=get_all_philosophers(),
                            value="All",
                            label="Filter by Philosopher",
                        )

                    with gr.Group():
                        gr.Markdown("**ℹ️ Stack**", elem_classes="section-label")
                        gr.Markdown(
                            "- Retrieval: **Hybrid BM25 + Semantic**\n"
                            "- Embeddings: **EmbeddingGemma-300M**\n"
                            "- Vector DB: **ChromaDB**\n"
                            "- Framework: **LangChain LCEL**\n"
                            "- UI: **Gradio 6**"
                        )

        # ── Tab 2 ─ Compare Models ───────────────────────────────────────
        with gr.Tab("⚖️ Compare Models"):
            gr.Markdown(
                "Run the same question through two models and compare quality, "
                "latency, and retrieval coverage side by side."
            )
            with gr.Row():
                compare_input = gr.Textbox(
                    label="Question",
                    placeholder="Ask a philosophical question…",
                    scale=4,
                )
                compare_philosopher = gr.Dropdown(
                    choices=get_all_philosophers(),
                    value="All",
                    label="Philosopher Filter",
                    scale=1,
                )
            compare_btn = gr.Button("▶ Compare", variant="primary")

            with gr.Row():
                with gr.Column():
                    model_a = gr.Dropdown(
                        choices=list(LLM_OPTIONS.keys()),
                        value=DEFAULT_LLM,
                        label="Model A",
                    )
                    response_a = gr.Markdown(label="Response A")
                    metrics_a = gr.Markdown(elem_classes="metric-bar")

                with gr.Column():
                    model_b = gr.Dropdown(
                        choices=list(LLM_OPTIONS.keys()),
                        value=_COMPARE_DEFAULT_B,
                        label="Model B",
                    )
                    response_b = gr.Markdown(label="Response B")
                    metrics_b = gr.Markdown(elem_classes="metric-bar")

        # ── Tab 3 ─ Knowledge Base ───────────────────────────────────────
        with gr.Tab("🗺️ Knowledge Base"):
            with gr.Row(equal_height=False):

                # Left: UMAP visualization
                with gr.Column(scale=2):
                    gr.Markdown(
                        "**Semantic Embedding Space** \n"
                        "Each point is one text chunk. Clusters indicate semantic similarity — "
                        "nearby chunks share philosophical themes regardless of source."
                    )
                    umap_plot = gr.Plot()
                    umap_btn = gr.Button(
                        "Generate Embedding Visualization", variant="secondary"
                    )
                    gr.Markdown(
                        "_UMAP projects ~5,700 × 768-dim embeddings to 2D. "
                        "First run takes ~1–2 min on CPU._"
                    )

                # Right: stats + upload
                with gr.Column(scale=1, min_width=280):
                    with gr.Group():
                        with gr.Row():
                            gr.Markdown(
                                "**📚 Knowledge Base**", elem_classes="section-label"
                            )
                            refresh_kb_btn = gr.Button("↻", size="sm", min_width=32)
                        kb_display = gr.Markdown(_kb_markdown())

                    with gr.Group():
                        gr.Markdown(
                            "**📤 Add Source**", elem_classes="section-label"
                        )
                        file_upload = gr.File(
                            label="Upload PDF or TXT",
                            file_types=[".pdf", ".txt"],
                        )
                        with gr.Row():
                            author_input = gr.Textbox(label="Author", scale=1)
                            title_input = gr.Textbox(label="Title", scale=1)
                        upload_btn = gr.Button(
                            "Add to Knowledge Base", variant="secondary", size="sm"
                        )
                        upload_status = gr.Textbox(
                            show_label=False,
                            interactive=False,
                            placeholder="Upload status will appear here…",
                            elem_classes="status-box",
                        )

    # ── Event wiring ─────────────────────────────────────────────────────

    msg_input.submit(
        respond_stream,
        inputs=[msg_input, chatbot_ui, philosopher_filter, llm_dropdown],
        outputs=[chatbot_ui, msg_input, retrieved_display, metrics_display],
    )

    compare_btn.click(
        compare_respond,
        inputs=[compare_input, compare_philosopher, model_a, model_b],
        outputs=[response_a, metrics_a, response_b, metrics_b],
    )

    umap_btn.click(build_umap_plot, outputs=umap_plot)

    refresh_kb_btn.click(refresh_kb, outputs=kb_display)

    upload_btn.click(
        upload_source,
        inputs=[file_upload, author_input, title_input],
        outputs=[upload_status, philosopher_filter],
    ).then(refresh_kb, outputs=kb_display)


def _auto_ingest() -> None:
    """Build the vectorstore automatically on first Spaces run."""
    if not vectorstore_exists():
        print("[startup] Vectorstore missing — running initial ingest (this takes ~10 min)…")
        try:
            import ingest
            ingest.main()
            print("[startup] Ingest complete.")
        except Exception as exc:
            print(f"[startup] Ingest failed: {exc}")


_auto_ingest()

if __name__ == "__main__":
    demo.launch(css=CSS)
|
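The chat input above is wired to `respond_stream`, which is defined earlier in app.py and not shown in this part of the diff. Gradio updates an output component each time a generator handler yields, so a streaming handler typically yields the answer accumulated so far rather than the latest fragment. The following is a minimal sketch of that accumulation pattern only; `accumulate_stream` is a hypothetical helper for illustration and is not part of the repository.

```python
from typing import Iterable, Iterator


def accumulate_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the growing answer after each streamed chunk (hypothetical helper)."""
    answer = ""
    for chunk in chunks:
        answer += chunk
        # A UI handler would yield this partial answer to refresh the chat bubble.
        yield answer


# Stand-in for text fragments coming from an LLM stream.
partials = list(accumulate_stream(["Nietzsche ", "rejects ", "nihilism."]))
print(partials[-1])
```

Yielding the full partial answer each time keeps the handler stateless from the UI's point of view: the component always shows the last yielded value.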
config.py
ADDED
|
@@ -0,0 +1,89 @@
import os
import torch
from pathlib import Path
from dotenv import load_dotenv

load_dotenv()

PROJECT_ROOT = Path(__file__).parent
DATA_DIR = PROJECT_ROOT / "data" / "texts"
VECTORSTORE_DIR = PROJECT_ROOT / "vectorstore"

GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY", "")
GROQ_API_KEY = os.getenv("GROQ_API_KEY", "")
OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY", "")

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# ---------------------------------------------------------------------------
# LLM options — (provider, model_id)
# Providers: "google" | "groq" | "openrouter"
# ---------------------------------------------------------------------------
LLM_OPTIONS: dict[str, tuple[str, str]] = {
    # ── Google AI Studio (free tier) ──────────────────────────────────────
    # Limits verified from aistudio.google.com/rate-limit (2026-05)
    "Gemma 4 MoE 26B [Google]": ("google", "gemma-4-26b-a4b-it"),  # 15 RPM | ∞ TPM | 1500 RPD
    "Gemma 4 Dense 31B [Google]": ("google", "gemma-4-31b-it"),  # 15 RPM | ∞ TPM | 1500 RPD
    "Gemini 3.1 Flash Lite [Google]": ("google", "gemini-3.1-flash-lite"),  # 15 RPM | 250K TPM | 500 RPD
    "Gemini 3.5 Flash [Google]": ("google", "gemini-3.5-flash"),  # 5 RPM | 250K TPM | 20 RPD
    "Gemini 3 Flash [Google]": ("google", "gemini-3-flash-preview"),  # 5 RPM | 250K TPM | 20 RPD
    "Gemini 2.5 Flash [Google]": ("google", "gemini-2.5-flash"),  # 5 RPM | 250K TPM | 20 RPD
    "Gemini 2.5 Flash Lite [Google]": ("google", "gemini-2.5-flash-lite"),  # 10 RPM | 250K TPM | 20 RPD
    # ── Groq (free tier, very fast LPU inference) ─────────────────────────
    "Llama 3.3 70B [Groq]": ("groq", "llama-3.3-70b-versatile"),
    "Llama 4 Scout 17B [Groq]": ("groq", "meta-llama/llama-4-scout-17b-16e-instruct"),
    "Qwen3 32B [Groq]": ("groq", "qwen/qwen3-32b"),
    "Llama 3.1 8B [Groq]": ("groq", "llama-3.1-8b-instant"),
    # ── OpenRouter free models (:free = no cost, rate-limited) ────────────
    "Nvidia Nemotron 120B [OpenRouter]": ("openrouter", "nvidia/nemotron-3-super-120b-a12b:free"),
    "OpenAI OSS 120B [OpenRouter]": ("openrouter", "openai/gpt-oss-120b:free"),
    "DeepSeek V4 Flash [OpenRouter]": ("openrouter", "deepseek/deepseek-v4-flash:free"),
    "Llama 3.3 70B [OpenRouter]": ("openrouter", "meta-llama/llama-3.3-70b-instruct:free"),
    "Qwen3 Next 80B [OpenRouter]": ("openrouter", "qwen/qwen3-next-80b-a3b-instruct:free"),
    "Gemma 4 MoE 26B [OpenRouter]": ("openrouter", "google/gemma-4-26b-a4b-it:free"),
}

DEFAULT_LLM = "Gemma 4 MoE 26B [Google]"

PROVIDER_KEYS = {
    "google": ("GOOGLE_API_KEY", "ai.google.dev"),
    "groq": ("GROQ_API_KEY", "console.groq.com"),
    "openrouter": ("OPENROUTER_API_KEY", "openrouter.ai"),
}

# ---------------------------------------------------------------------------
# Embedding
# ---------------------------------------------------------------------------
EMBEDDING_OPTIONS = {
    "EmbeddingGemma 300M (active)": "google/embeddinggemma-300m",
    "BGE Large EN v1.5": "BAAI/bge-large-en-v1.5",
    "Multilingual E5 Large": "intfloat/multilingual-e5-large",
}
DEFAULT_EMBEDDING = "EmbeddingGemma 300M (active)"
EMBEDDING_MODEL = EMBEDDING_OPTIONS[DEFAULT_EMBEDDING]

# ---------------------------------------------------------------------------
# RAG
# ---------------------------------------------------------------------------
CHUNK_SIZE = 1000
CHUNK_OVERLAP = 150
RETRIEVAL_K = 6  # slightly more to absorb BM25 extras
USE_HYBRID_SEARCH = True  # BM25 + semantic ensemble

# ---------------------------------------------------------------------------
# Knowledge base sources (Project Gutenberg)
# ---------------------------------------------------------------------------
SOURCES = [
    {"philosopher": "Nietzsche", "title": "Thus Spoke Zarathustra", "gutenberg_id": 1998},
    {"philosopher": "Nietzsche", "title": "Beyond Good and Evil", "gutenberg_id": 4363},
    {"philosopher": "Nietzsche", "title": "On the Genealogy of Morality", "gutenberg_id": 52319},
    {"philosopher": "Nietzsche", "title": "The Birth of Tragedy", "gutenberg_id": 51356},
    {"philosopher": "Schopenhauer", "title": "Essays of Arthur Schopenhauer", "gutenberg_id": 11945},
    {"philosopher": "Hume", "title": "An Enquiry Concerning Human Understanding", "gutenberg_id": 9662},
    {"philosopher": "Russell", "title": "The Problems of Philosophy", "gutenberg_id": 5827},
    {"philosopher": "Marcus Aurelius", "title": "Meditations", "gutenberg_id": 2680},
    {"philosopher": "Plato", "title": "The Republic", "gutenberg_id": 1497},
    {"philosopher": "Mill", "title": "Utilitarianism", "gutenberg_id": 11224},
    {"philosopher": "Epictetus", "title": "The Enchiridion", "gutenberg_id": 45109},
    {"philosopher": "Kant", "title": "Fundamental Principles of the Metaphysic of Morals", "gutenberg_id": 5682},
]
|
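`LLM_OPTIONS` maps a display label to a `(provider, model_id)` pair, and `query()` in rag_chain.py resolves a label with `LLM_OPTIONS.get(llm_label, LLM_OPTIONS[DEFAULT_LLM])`. The sketch below illustrates that lookup on a two-entry copy of the table; the wrapper name `resolve_llm` is an assumption made for illustration and does not exist in the repository.

```python
# Two entries copied from config.py; the full table has more.
LLM_OPTIONS: dict[str, tuple[str, str]] = {
    "Gemma 4 MoE 26B [Google]": ("google", "gemma-4-26b-a4b-it"),
    "Llama 3.3 70B [Groq]": ("groq", "llama-3.3-70b-versatile"),
}
DEFAULT_LLM = "Gemma 4 MoE 26B [Google]"


def resolve_llm(label: str) -> tuple[str, str]:
    """Return (provider, model_id); unknown labels fall back to the default model."""
    return LLM_OPTIONS.get(label, LLM_OPTIONS[DEFAULT_LLM])


provider, model_id = resolve_llm("Llama 3.3 70B [Groq]")
fallback = resolve_llm("Some label that is not configured")
print(provider, model_id, fallback)
```

Falling back silently keeps the UI usable if a stale label is submitted, at the cost of hiding the mismatch; raising on unknown labels would be the stricter alternative.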
ingest.py
ADDED
|
@@ -0,0 +1,157 @@
"""
Build or update the ChromaDB vectorstore from philosophical texts.

    python ingest.py            # incremental: skips already-indexed sources
    python ingest.py --rebuild  # wipes and rebuilds from scratch
"""

import sys
import time
import requests
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.documents import Document
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_chroma import Chroma
from config import (
    DATA_DIR, VECTORSTORE_DIR,
    EMBEDDING_MODEL, CHUNK_SIZE, CHUNK_OVERLAP, SOURCES, DEVICE
)

GUTENBERG_URL = "https://www.gutenberg.org/cache/epub/{id}/pg{id}.txt"
BATCH_SIZE = 50
SLEEP_BETWEEN_BATCHES = 2


def download_gutenberg(gutenberg_id: int, title: str) -> str:
    url = GUTENBERG_URL.format(id=gutenberg_id)
    print(f"  Downloading {url}")
    try:
        resp = requests.get(url, timeout=30)
        resp.raise_for_status()
        return resp.text
    except Exception as e:
        print(f"  ERROR: {e}")
        return ""


def strip_gutenberg_boilerplate(text: str) -> str:
    start_markers = [
        "*** START OF THE PROJECT GUTENBERG",
        "***START OF THE PROJECT GUTENBERG",
        "*** START OF THIS PROJECT GUTENBERG",
    ]
    end_markers = [
        "*** END OF THE PROJECT GUTENBERG",
        "***END OF THE PROJECT GUTENBERG",
        "*** END OF THIS PROJECT GUTENBERG",
    ]
    start_idx = 0
    for marker in start_markers:
        idx = text.find(marker)
        if idx != -1:
            start_idx = text.find("\n", idx) + 1
            break
    end_idx = len(text)
    for marker in end_markers:
        idx = text.find(marker)
        if idx != -1:
            end_idx = idx
            break
    return text[start_idx:end_idx].strip()


def get_embeddings() -> HuggingFaceEmbeddings:
    print(f"Loading embedding model on {DEVICE}...")
    return HuggingFaceEmbeddings(
        model_name=EMBEDDING_MODEL,
        model_kwargs={"device": DEVICE},
        encode_kwargs={"prompt_name": "document", "normalize_embeddings": True},
        query_encode_kwargs={"prompt_name": "query", "normalize_embeddings": True},
    )


def get_indexed_titles(vectorstore: Chroma) -> set[str]:
    result = vectorstore.get(include=["metadatas"])
    return {m.get("title", "") for m in result["metadatas"]}


def ingest_source(source: dict, vectorstore: Chroma, splitter: RecursiveCharacterTextSplitter) -> int:
    raw = download_gutenberg(source["gutenberg_id"], source["title"])
    if not raw:
        return 0

    cleaned = strip_gutenberg_boilerplate(raw)

    # Cache locally
    DATA_DIR.mkdir(parents=True, exist_ok=True)
    safe_name = f"{source['philosopher']}_{source['title'][:40].replace(' ', '_')}.txt"
    (DATA_DIR / safe_name).write_text(cleaned, encoding="utf-8")

    chunks = splitter.split_text(cleaned)
    docs = [
        Document(
            page_content=chunk,
            metadata={
                "philosopher": source["philosopher"],
                "title": source["title"],
                "source": f"{source['philosopher']} — *{source['title']}*",
            },
        )
        for chunk in chunks
    ]

    for i in range(0, len(docs), BATCH_SIZE):
        vectorstore.add_documents(docs[i : i + BATCH_SIZE])
        if i + BATCH_SIZE < len(docs):
            time.sleep(SLEEP_BETWEEN_BATCHES)

    return len(docs)


def main() -> None:
    rebuild = "--rebuild" in sys.argv

    VECTORSTORE_DIR.mkdir(parents=True, exist_ok=True)

    embeddings = get_embeddings()
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=CHUNK_SIZE,
        chunk_overlap=CHUNK_OVERLAP,
        separators=["\n\n", "\n", ". ", " ", ""],
    )

    if rebuild and VECTORSTORE_DIR.exists():
        import shutil
        shutil.rmtree(VECTORSTORE_DIR)
        VECTORSTORE_DIR.mkdir()
        print("Vectorstore wiped for rebuild.")

    vectorstore = Chroma(
        collection_name="philosophers",
        embedding_function=embeddings,
        persist_directory=str(VECTORSTORE_DIR),
    )

    already_indexed = get_indexed_titles(vectorstore) if not rebuild else set()
    total_new = 0

    for source in SOURCES:
        print(f"\n[{source['philosopher']}] {source['title']}")
        if source["title"] in already_indexed:
            print("  SKIPPED (already indexed)")
            continue

        n = ingest_source(source, vectorstore, splitter)
        if n:
            print(f"  -> {n} chunks added")
            total_new += n
        time.sleep(1)

    if total_new:
        print(f"\nDone. {total_new} new chunks added to vectorstore.")
    else:
        print("\nNothing new to index.")


if __name__ == "__main__":
    main()
|
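To make the effect of `strip_gutenberg_boilerplate` concrete, here is a reduced sketch that uses only one start marker and one end marker from ingest.py and runs on a made-up sample string. The function name `strip_boilerplate` and the sample text are assumptions for illustration; the real function tries three marker variants for each end of the text.

```python
START = "*** START OF THE PROJECT GUTENBERG"
END = "*** END OF THE PROJECT GUTENBERG"


def strip_boilerplate(text: str) -> str:
    """Keep only the text between the start-marker line and the end marker."""
    start_idx = 0
    idx = text.find(START)
    if idx != -1:
        # Skip the rest of the marker line, so the body starts on the next line.
        start_idx = text.find("\n", idx) + 1
    end_idx = len(text)
    idx = text.find(END)
    if idx != -1:
        end_idx = idx
    return text[start_idx:end_idx].strip()


# Made-up sample in the shape of a Project Gutenberg plain-text file.
sample = (
    "License header\n"
    "*** START OF THE PROJECT GUTENBERG EBOOK MEDITATIONS ***\n"
    "Begin the morning by saying to thyself...\n"
    "*** END OF THE PROJECT GUTENBERG EBOOK MEDITATIONS ***\n"
    "License footer\n"
)
body = strip_boilerplate(sample)
print(body)
```

If neither marker is found, the whole text is returned (stripped), so a file with an unexpected header format is indexed with its license text included rather than being dropped.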
rag_chain.py
ADDED
|
@@ -0,0 +1,353 @@
from functools import lru_cache
from pathlib import Path
from typing import Generator

from google import genai
from google.genai import types
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_chroma import Chroma
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
from config import (
    GOOGLE_API_KEY, GROQ_API_KEY, OPENROUTER_API_KEY,
    LLM_OPTIONS, DEFAULT_LLM,
    EMBEDDING_MODEL, VECTORSTORE_DIR, RETRIEVAL_K,
    CHUNK_SIZE, CHUNK_OVERLAP, DEVICE, PROVIDER_KEYS,
    USE_HYBRID_SEARCH,
)

SYSTEM_PROMPT = (
    "You are a philosophical assistant with deep knowledge of Western philosophy, "
    "particularly nihilism, absurdism, pessimism, existentialism, and epistemology. "
    "Your answers are grounded in the primary texts provided as context.\n\n"
    "Rules:\n"
    "- Draw directly from the retrieved context passages.\n"
    "- Always cite the philosopher and work "
    "(e.g., 'As Nietzsche writes in *Thus Spoke Zarathustra*...').\n"
    "- Be intellectually rigorous but accessible.\n"
    "- If the context is insufficient, say so clearly.\n"
    "- Present the philosophers' views faithfully without moralizing."
)


# ---------------------------------------------------------------------------
# Cached singletons
# ---------------------------------------------------------------------------

@lru_cache(maxsize=1)
def _get_genai_client() -> genai.Client:
    return genai.Client(api_key=GOOGLE_API_KEY)


@lru_cache(maxsize=1)
def _get_embeddings() -> HuggingFaceEmbeddings:
    return HuggingFaceEmbeddings(
        model_name=EMBEDDING_MODEL,
        model_kwargs={"device": DEVICE},
        encode_kwargs={"prompt_name": "document", "normalize_embeddings": True},
        query_encode_kwargs={"prompt_name": "query", "normalize_embeddings": True},
    )


@lru_cache(maxsize=1)
def _get_vectorstore() -> Chroma:
    return Chroma(
        collection_name="philosophers",
        embedding_function=_get_embeddings(),
        persist_directory=str(VECTORSTORE_DIR),
    )


@lru_cache(maxsize=1)
def _get_bm25_retriever():
    """Build BM25 index over the full KB (cached after first call)."""
    from langchain_community.retrievers import BM25Retriever  # requires rank-bm25
    result = _get_vectorstore().get(include=["documents", "metadatas"])
    docs = [
        Document(page_content=d, metadata=m)
        for d, m in zip(result["documents"], result["metadatas"])
        if d.strip()
    ]
    retriever = BM25Retriever.from_documents(docs)
    retriever.k = RETRIEVAL_K
    return retriever


# ---------------------------------------------------------------------------
# Public helpers
# ---------------------------------------------------------------------------

def vectorstore_exists() -> bool:
    return (VECTORSTORE_DIR / "chroma.sqlite3").exists()


def get_all_philosophers() -> list[str]:
    if not vectorstore_exists():
        return ["All"]
    result = _get_vectorstore().get(include=["metadatas"])
    names = sorted({m["philosopher"] for m in result["metadatas"] if "philosopher" in m})
    return ["All"] + names


def get_kb_stats() -> dict:
    if not vectorstore_exists():
        return {"total": 0, "sources": {}}
    result = _get_vectorstore().get(include=["metadatas"])
    sources: dict[str, set] = {}
    for m in result["metadatas"]:
        phil = m.get("philosopher", "Unknown")
        title = m.get("title", "Unknown")
        sources.setdefault(phil, set()).add(title)
    return {"total": len(result["ids"]), "sources": sources}


# ---------------------------------------------------------------------------
# Retrieval
# ---------------------------------------------------------------------------

def retrieve_docs(
    input_text: str, philosopher: str = "All"
) -> tuple[list[Document], list[float]]:
    """Hybrid BM25 + semantic retrieval.

    Returns (docs, scores) where scores are relevance scores, nominally in [0, 1].
    BM25-only results are tagged with score -1.0 (no embedding similarity).
    """
    vectorstore = _get_vectorstore()
    search_kwargs: dict = {"k": RETRIEVAL_K}
    if philosopher != "All":
        search_kwargs["filter"] = {"philosopher": philosopher}

    pairs = vectorstore.similarity_search_with_relevance_scores(input_text, **search_kwargs)

    if USE_HYBRID_SEARCH and philosopher == "All":
        try:
            bm25_docs = _get_bm25_retriever().invoke(input_text)
            seen = {doc.page_content for doc, _ in pairs}
            for doc in bm25_docs[:2]:
                if doc.page_content not in seen:
                    pairs.append((doc, -1.0))
                    seen.add(doc.page_content)
        except Exception:
            # BM25 is best-effort: fall back to semantic-only results on failure
            pass

    # Sort: semantic scores descending, BM25 appended at end
    semantic = sorted([(d, s) for d, s in pairs if s >= 0], key=lambda x: x[1], reverse=True)
    bm25_only = [(d, s) for d, s in pairs if s < 0]
    pairs = (semantic + bm25_only)[: RETRIEVAL_K + 2]

    return [d for d, _ in pairs], [s for _, s in pairs]


# ---------------------------------------------------------------------------
# LLM calls — non-streaming
# ---------------------------------------------------------------------------

def _call_llm(provider: str, model_id: str, context_str: str, input_text: str) -> str:
    user_content = (
        f"Context from philosophical texts:\n{context_str}\n\nQuestion: {input_text}"
    )

    if provider == "google":
        if not GOOGLE_API_KEY:
            env_var, site = PROVIDER_KEYS["google"]
            raise ValueError(f"{env_var} not set. Get a free key at {site}")
        response = _get_genai_client().models.generate_content(
            model=model_id,
            contents=user_content,
            config=types.GenerateContentConfig(
                system_instruction=SYSTEM_PROMPT, temperature=0.3
            ),
        )
        return response.text

    elif provider == "groq":
        if not GROQ_API_KEY:
            env_var, site = PROVIDER_KEYS["groq"]
            raise ValueError(f"{env_var} not set. Get a free key at {site}")
        from openai import OpenAI
        client = OpenAI(api_key=GROQ_API_KEY, base_url="https://api.groq.com/openai/v1")

    elif provider == "openrouter":
        if not OPENROUTER_API_KEY:
            env_var, site = PROVIDER_KEYS["openrouter"]
            raise ValueError(f"{env_var} not set. Get a free key at {site}")
        from openai import OpenAI
        client = OpenAI(
            api_key=OPENROUTER_API_KEY,
            base_url="https://openrouter.ai/api/v1",
            default_headers={"HTTP-Referer": "https://github.com/Fikri645/philosopher-chat"},
        )
    else:
        raise ValueError(f"Unknown provider: {provider!r}")

    resp = client.chat.completions.create(
        model=model_id,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_content},
        ],
        temperature=0.3,
    )
    return resp.choices[0].message.content


| 195 |
+
# ---------------------------------------------------------------------------
|
| 196 |
+
# LLM calls — streaming
|
| 197 |
+
# ---------------------------------------------------------------------------
|
| 198 |
+
|
| 199 |
+
def stream_llm(
|
| 200 |
+
provider: str, model_id: str, context_str: str, input_text: str
|
| 201 |
+
) -> Generator[str, None, None]:
|
| 202 |
+
"""Yield text chunks for real-time streaming."""
|
| 203 |
+
user_content = (
|
| 204 |
+
f"Context from philosophical texts:\n{context_str}\n\nQuestion: {input_text}"
|
| 205 |
+
)
|
| 206 |
+
|
| 207 |
+
if provider == "google":
|
| 208 |
+
if not GOOGLE_API_KEY:
|
| 209 |
+
env_var, site = PROVIDER_KEYS["google"]
|
| 210 |
+
raise ValueError(f"{env_var} not set. Get a free key at {site}")
|
| 211 |
+
for chunk in _get_genai_client().models.generate_content_stream(
|
| 212 |
+
model=model_id,
|
| 213 |
+
contents=user_content,
|
| 214 |
+
config=types.GenerateContentConfig(
|
| 215 |
+
system_instruction=SYSTEM_PROMPT, temperature=0.3
|
| 216 |
+
),
|
| 217 |
+
):
|
| 218 |
+
if chunk.text:
|
| 219 |
+
yield chunk.text
|
| 220 |
+
|
| 221 |
+
elif provider in ("groq", "openrouter"):
|
| 222 |
+
if provider == "groq":
|
| 223 |
+
if not GROQ_API_KEY:
|
| 224 |
+
env_var, site = PROVIDER_KEYS["groq"]
|
| 225 |
+
raise ValueError(f"{env_var} not set. Get a free key at {site}")
|
| 226 |
+
from openai import OpenAI
|
| 227 |
+
client = OpenAI(
|
| 228 |
+
api_key=GROQ_API_KEY, base_url="https://api.groq.com/openai/v1"
|
| 229 |
+
)
|
| 230 |
+
else:
|
| 231 |
+
if not OPENROUTER_API_KEY:
|
| 232 |
+
env_var, site = PROVIDER_KEYS["openrouter"]
|
| 233 |
+
raise ValueError(f"{env_var} not set. Get a free key at {site}")
|
| 234 |
+
from openai import OpenAI
|
| 235 |
+
client = OpenAI(
|
| 236 |
+
api_key=OPENROUTER_API_KEY,
|
| 237 |
+
base_url="https://openrouter.ai/api/v1",
|
| 238 |
+
default_headers={
|
| 239 |
+
"HTTP-Referer": "https://github.com/Fikri645/philosopher-chat"
|
| 240 |
+
},
|
| 241 |
+
)
|
| 242 |
+
stream = client.chat.completions.create(
|
| 243 |
+
model=model_id,
|
| 244 |
+
messages=[
|
| 245 |
+
{"role": "system", "content": SYSTEM_PROMPT},
|
| 246 |
+
{"role": "user", "content": user_content},
|
| 247 |
+
],
|
| 248 |
+
temperature=0.3,
|
| 249 |
+
stream=True,
|
| 250 |
+
)
|
| 251 |
+
for chunk in stream:
|
| 252 |
+
content = chunk.choices[0].delta.content
|
| 253 |
+
if content:
|
| 254 |
+
yield content
|
| 255 |
+
|
| 256 |
+
else:
|
| 257 |
+
raise ValueError(f"Unknown provider: {provider!r}")
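`stream_llm` yields incremental text deltas, whereas a chat UI that redraws the whole message on each update needs the text accumulated so far. A minimal sketch of that accumulation step follows. `fake_stream` and `accumulate` are hypothetical names, and the stub generator stands in for a real provider call.

```python
from typing import Generator, Iterable


def fake_stream() -> Generator[str, None, None]:
    """Hypothetical stand-in for stream_llm(...): yields text deltas."""
    yield from ["The will ", "to power ", "is..."]


def accumulate(chunks: Iterable[str]) -> Generator[str, None, None]:
    """Yield the full text so far after each incoming delta."""
    text = ""
    for chunk in chunks:
        text += chunk
        yield text


# Each snapshot is what a UI handler would display at that moment.
snapshots = list(accumulate(fake_stream()))
```

In the app, the same wrapper could presumably sit between `stream_llm(...)` and the UI callback, so that the provider-specific code stays free of display concerns.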


# ---------------------------------------------------------------------------
# Public query interface
# ---------------------------------------------------------------------------

def query(
    input_text: str, philosopher: str = "All", llm_label: str = DEFAULT_LLM
) -> dict:
    """Non-streaming query. Returns answer + context + scores."""
    provider, model_id = LLM_OPTIONS.get(llm_label, LLM_OPTIONS[DEFAULT_LLM])
    docs, scores = retrieve_docs(input_text, philosopher)
    context_str = "\n\n".join(d.page_content for d in docs)
    answer = _call_llm(provider, model_id, context_str, input_text)
    return {"answer": answer, "context": docs, "scores": scores}


# ---------------------------------------------------------------------------
# UMAP embedding visualization
# ---------------------------------------------------------------------------

def get_umap_data() -> dict | None:
    """Compute 2D UMAP projection of all KB embeddings.

    Returns dict ready for plotly, or None if unavailable.
    """
    import numpy as np

    try:
        import umap as umap_module  # type: ignore
    except ImportError:
        return None

    if not vectorstore_exists():
        return None

    result = _get_vectorstore().get(include=["embeddings", "metadatas", "documents"])
    embeddings_raw = result.get("embeddings")
    if embeddings_raw is None or len(embeddings_raw) == 0:
        return None

    embeddings = np.array(embeddings_raw)
    reducer = umap_module.UMAP(
        n_components=2, random_state=42, n_neighbors=15, min_dist=0.1
    )
    coords = reducer.fit_transform(embeddings)

    return {
        "x": coords[:, 0].tolist(),
        "y": coords[:, 1].tolist(),
        "philosopher": [m.get("philosopher", "Unknown") for m in result["metadatas"]],
        "title": [m.get("title", "Unknown") for m in result["metadatas"]],
        "preview": [d[:120].replace("\n", " ") + "…" for d in result["documents"]],
    }
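The dict returned by `get_umap_data` holds five parallel lists, while a scatter plot with one colour per philosopher needs the points grouped by name. A minimal sketch of that grouping follows, using only the standard library. `group_by_philosopher` is a hypothetical helper, and the hover-text format is an assumption rather than what `app.py` actually does.

```python
from collections import defaultdict


def group_by_philosopher(data: dict) -> dict:
    """Split the parallel lists into one x/y/text trace per philosopher."""
    traces = defaultdict(lambda: {"x": [], "y": [], "text": []})
    rows = zip(
        data["x"], data["y"], data["philosopher"], data["title"], data["preview"]
    )
    for x, y, name, title, preview in rows:
        trace = traces[name]
        trace["x"].append(x)
        trace["y"].append(y)
        # Hover text combines the work title with the chunk preview.
        trace["text"].append(f"{title}: {preview}")
    return dict(traces)
```

Each resulting trace could then be drawn as its own marker series, for example one `plotly.graph_objects.Scatter(x=..., y=..., text=..., mode="markers", name=...)` per philosopher.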


# ---------------------------------------------------------------------------
# KB management
# ---------------------------------------------------------------------------

def add_to_kb(file_path: str | Path, author: str, title: str) -> int:
    """Chunk, embed, and add a file to the vectorstore. Returns chunk count."""
    file_path = Path(file_path)

    if file_path.suffix.lower() == ".pdf":
        from pypdf import PdfReader
        reader = PdfReader(str(file_path))
        # Extract each page once and skip pages with no extractable text.
        page_texts = (page.extract_text() for page in reader.pages)
        text = "\n\n".join(t for t in page_texts if t)
    else:
        text = file_path.read_text(encoding="utf-8", errors="replace")

    if not text.strip():
        raise ValueError("Could not extract text from the uploaded file.")

    splitter = RecursiveCharacterTextSplitter(
        chunk_size=CHUNK_SIZE,
        chunk_overlap=CHUNK_OVERLAP,
        separators=["\n\n", "\n", ". ", " ", ""],
    )
    docs = [
        Document(
            page_content=chunk,
            metadata={
                "philosopher": author.strip(),
                "title": title.strip(),
                "source": f"{author.strip()} — *{title.strip()}*",
            },
        )
        for chunk in splitter.split_text(text)
    ]

    _get_vectorstore().add_documents(docs)
    _get_bm25_retriever.cache_clear()  # invalidate BM25 index after KB change
    return len(docs)
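`CHUNK_SIZE` and `CHUNK_OVERLAP` control how much text each chunk holds and how much adjacent chunks share. The sketch below is a simplified fixed-window splitter, not `RecursiveCharacterTextSplitter`, which additionally prefers to break on the listed separators. `split_with_overlap` is a hypothetical name used only to illustrate the two parameters.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list:
    """Cut text into windows of chunk_size characters that share chunk_overlap."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    if not text:
        return []

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    start = 0
    while True:
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return chunks
```

A larger overlap repeats more text across neighbouring chunks, which tends to help retrieval when an argument straddles a chunk boundary, at the cost of more chunks to embed and store.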
|
requirements.txt
ADDED
|
@@ -0,0 +1,17 @@
google-genai>=1.0.0
langchain>=0.3.0
langchain-google-genai>=2.0.0
langchain-huggingface>=0.1.0
langchain-community>=0.3.0
langchain-chroma>=0.1.4
langchain-text-splitters>=0.3.0
chromadb>=0.5.0
sentence-transformers>=3.0.0
gradio>=4.44.0
python-dotenv>=1.0.0
requests>=2.31.0
pypdf>=4.0.0
openai>=1.0.0
rank-bm25>=0.2.2
umap-learn>=0.5.0
plotly>=5.0.0