Spaces:
Sleeping
Sleeping
Sohan Kshirsagar commited on
Commit ·
9fabeb7
1
Parent(s): 4f0dfc7
Backend Documentation Addition
Browse files- multi_llm_chatbot_backend/README.md +178 -0
- multi_llm_chatbot_backend/app/api/README.md +175 -0
- multi_llm_chatbot_backend/app/core/README.md +200 -0
- multi_llm_chatbot_backend/app/llm/README.md +173 -0
- multi_llm_chatbot_backend/app/models/README.md +140 -0
- multi_llm_chatbot_backend/app/utils/README.md +142 -0
multi_llm_chatbot_backend/README.md
ADDED
|
@@ -0,0 +1,178 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Multi-LLM Chatbot Backend
|
| 2 |
+
|
| 3 |
+
A modular, extensible FastAPI backend for building an AI-powered research advisor chatbot that supports:
|
| 4 |
+
- Multiple AI personas with configurable tone and behavior
|
| 5 |
+
- Dynamic switching between Gemini (cloud) and Ollama (local) LLMs
|
| 6 |
+
- Chat session persistence and context memory
|
| 7 |
+
- Document upload, chunking, and retrieval using RAG
|
| 8 |
+
- Rich export features (PDF, DOCX, TXT)
|
| 9 |
+
- User authentication and JWT-based access control
|
| 10 |
+
|
| 11 |
+
---
|
| 12 |
+
|
| 13 |
+
## Backend Architecture
|
| 14 |
+
|
| 15 |
+
```text
|
| 16 |
+
User Input
|
| 17 |
+
↓
|
| 18 |
+
/chat-sequential → Orchestrator
|
| 19 |
+
↓ ↙ ↘
|
| 20 |
+
SessionManager ContextManager RAGManager
|
| 21 |
+
↓ ↓ ↓
|
| 22 |
+
MongoDB Token Trimming ChromaDB
|
| 23 |
+
↓ ↓ ↓
|
| 24 |
+
Persisted Chat & Doc Context → LLM (Gemini/Ollama)
|
| 25 |
+
```
|
| 26 |
+
|
| 27 |
+
---
|
| 28 |
+
|
| 29 |
+
## Features
|
| 30 |
+
|
| 31 |
+
- Persona-based multi-agent conversation (`Theorist`, `Pragmatist`, etc.)
|
| 32 |
+
- Provider switching (Gemini ↔ Ollama)
|
| 33 |
+
- Context-aware response routing + top-K advisor selection
|
| 34 |
+
- PDF, DOCX, and TXT file upload and semantic retrieval
|
| 35 |
+
- Developer tools: debug personas, test RAG, export sessions
|
| 36 |
+
- Secure authentication and session scoping
|
| 37 |
+
|
| 38 |
+
---
|
| 39 |
+
|
| 40 |
+
## Setup Instructions
|
| 41 |
+
|
| 42 |
+
### 1. Clone and Configure Environment
|
| 43 |
+
|
| 44 |
+
```bash
|
| 45 |
+
git clone https://github.com/yourorg/multi-llm-chatbot-backend
|
| 46 |
+
cd multi-llm-chatbot-backend
|
| 47 |
+
cp .env.example .env # already provided
|
| 48 |
+
```
|
| 49 |
+
|
| 50 |
+
### 2. Python Environment Setup
|
| 51 |
+
|
| 52 |
+
```bash
|
| 53 |
+
python -m venv venv
|
| 54 |
+
source venv/bin/activate # or venv\Scripts\activate on Windows
|
| 55 |
+
|
| 56 |
+
pip install -r requirements.txt
|
| 57 |
+
```
|
| 58 |
+
|
| 59 |
+
### 3. Run the Server
|
| 60 |
+
|
| 61 |
+
```bash
|
| 62 |
+
uvicorn app.main:app --reload
|
| 63 |
+
```
|
| 64 |
+
|
| 65 |
+
> Server will be available at: `http://localhost:8000`
|
| 66 |
+
|
| 67 |
+
---
|
| 68 |
+
|
| 69 |
+
## FastAPI Routing & Modules
|
| 70 |
+
|
| 71 |
+
| Folder | Description |
|
| 72 |
+
|--------|-------------|
|
| 73 |
+
| [`app/api`](./api_README.md) | REST API endpoints for chat, auth, RAG, exports |
|
| 74 |
+
| [`app/core`](./core_README.md) | Main orchestration, context windows, database logic |
|
| 75 |
+
| [`app/llm`](./llm_README.md) | Gemini + Ollama LLM wrappers |
|
| 76 |
+
| [`app/models`](./models_README.md) | Persona and user schemas |
|
| 77 |
+
| [`app/utils`](./utils_README.md) | File parsing, summaries, exports, vector helpers |
|
| 78 |
+
|
| 79 |
+
---
|
| 80 |
+
|
| 81 |
+
## Key Files
|
| 82 |
+
|
| 83 |
+
### `main.py`
|
| 84 |
+
|
| 85 |
+
- Loads env vars, sets up FastAPI instance with CORS and routers
|
| 86 |
+
- Calls `connect_to_mongo()` on startup and `close_mongo_connection()` on shutdown
|
| 87 |
+
- Imports and registers all routers (`auth`, `chat_sessions`, etc.)
|
| 88 |
+
|
| 89 |
+
### `.env` (Sample Vars)
|
| 90 |
+
|
| 91 |
+
```ini
|
| 92 |
+
# MongoDB
|
| 93 |
+
MONGODB_CONNECTION_STRING=mongodb://localhost:27017
|
| 94 |
+
MONGODB_DATABASE_NAME=neon_ai_backend
|
| 95 |
+
|
| 96 |
+
# Gemini API Key and model
|
| 97 |
+
GEMINI_API_KEY=... # Replace with real key
|
| 98 |
+
GEMINI_MODEL=gemini-2.0-flash
|
| 99 |
+
|
| 100 |
+
# Default provider
|
| 101 |
+
DEFAULT_PROVIDER=gemini
|
| 102 |
+
```
|
| 103 |
+
|
| 104 |
+
### `requirements.txt`
|
| 105 |
+
|
| 106 |
+
Includes:
|
| 107 |
+
- **FastAPI**, **Uvicorn**: API framework and server
|
| 108 |
+
- **httpx**: Async LLM request handler
|
| 109 |
+
- **motor**, **pymongo**: MongoDB async access
|
| 110 |
+
- **chromadb**, **sentence-transformers**: Vector database + embeddings
|
| 111 |
+
- **PyPDF2**, **docx2txt**, **reportlab**: Document parsing and PDF generation
|
| 112 |
+
- **passlib**, **python-jose**: Auth and security
|
| 113 |
+
|
| 114 |
+
---
|
| 115 |
+
|
| 116 |
+
## Persona Design & Context Handling
|
| 117 |
+
|
| 118 |
+
- Personas defined in `app/models/default_personas.py`
|
| 119 |
+
- Rich system prompts, styles, and epistemologies
|
| 120 |
+
- Responses routed through `ImprovedChatOrchestrator`
|
| 121 |
+
- Context trimmed and weighted via `ContextManager`
|
| 122 |
+
|
| 123 |
+
---
|
| 124 |
+
|
| 125 |
+
## Switching LLM Providers
|
| 126 |
+
|
| 127 |
+
You can hot-swap models via API:
|
| 128 |
+
|
| 129 |
+
```http
|
| 130 |
+
POST /switch-provider
|
| 131 |
+
{ "provider": "gemini" } | { "provider": "ollama" }
|
| 132 |
+
```
|
| 133 |
+
|
| 134 |
+
> Also supported: `/switch-model`, `/current-model`, `/current-provider`
|
| 135 |
+
|
| 136 |
+
---
|
| 137 |
+
|
| 138 |
+
## Document Upload + RAG
|
| 139 |
+
|
| 140 |
+
- Upload PDFs, DOCX, or TXT to sessions
|
| 141 |
+
- Text is extracted → chunked → embedded → stored in ChromaDB
|
| 142 |
+
- Queried during conversation by persona-aware `EnhancedRAGManager`
|
| 143 |
+
|
| 144 |
+
---
|
| 145 |
+
|
| 146 |
+
## Export Options
|
| 147 |
+
|
| 148 |
+
| Format | Export Endpoint |
|
| 149 |
+
|--------|------------------|
|
| 150 |
+
| PDF | `/export-chat?format=pdf` |
|
| 151 |
+
| DOCX | `/export-chat?format=docx` |
|
| 152 |
+
| TXT | `/export-chat?format=txt` |
|
| 153 |
+
| Summary | `/chat-summary?format=pdf` |
|
| 154 |
+
|
| 155 |
+
---
|
| 156 |
+
|
| 157 |
+
## Developer & Debug Endpoints
|
| 158 |
+
|
| 159 |
+
| Endpoint | Purpose |
|
| 160 |
+
|----------|---------|
|
| 161 |
+
| `/debug/personas` | See registered advisors and prompts |
|
| 162 |
+
| `/debug/ranked-personas` | View top-K advisors for context |
|
| 163 |
+
| `/debug/rag-status` | Run sample search to test document index |
|
| 164 |
+
|
| 165 |
+
---
|
| 166 |
+
|
| 167 |
+
## Status & Roadmap
|
| 168 |
+
|
| 169 |
+
- [x] Multi-LLM backend ready (Gemini + Ollama)
|
| 170 |
+
- [x] Document RAG + export system
|
| 171 |
+
- [x] Session-aware persona routing
|
| 172 |
+
- [x] JWT Auth + MongoDB user handling
|
| 173 |
+
- [ ] UI enhancements and persona memory
|
| 174 |
+
- [ ] Persona fine-tuning support (future)
|
| 175 |
+
|
| 176 |
+
---
|
| 177 |
+
|
| 178 |
+
For questions, contributions, or deployment help — feel free to reach out!
|
multi_llm_chatbot_backend/app/api/README.md
ADDED
|
@@ -0,0 +1,175 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# `app/api` – REST API Layer for Multi-LLM Chatbot
|
| 2 |
+
|
| 3 |
+
This module defines the complete FastAPI-based HTTP interface for all backend features, including chat, session management, RAG operations, provider switching, and document interaction.
|
| 4 |
+
|
| 5 |
+
Each file in this directory defines route groups (`APIRouter`) to modularize functionality.
|
| 6 |
+
|
| 7 |
+
---
|
| 8 |
+
|
| 9 |
+
## API Directory Layout
|
| 10 |
+
|
| 11 |
+
| File | Purpose |
|
| 12 |
+
|------|---------|
|
| 13 |
+
| `auth.py` | Handles user authentication (login, signup, token validation) |
|
| 14 |
+
| `chat.py` | Core routes for LLM-backed chat, reply-to-advisor, and multi-turn flow |
|
| 15 |
+
| `chat_sessions.py` | Stores user conversations and provides access to saved history |
|
| 16 |
+
| `debug.py` | Developer tools: debug personas, RAG tests, ranking advisor responses |
|
| 17 |
+
| `documents.py` | Upload, parse, index, and query documents via RAG |
|
| 18 |
+
| `provider.py` | Switch between Gemini and Ollama providers |
|
| 19 |
+
| `root.py` | Root `/` endpoint for heartbeat and versioning |
|
| 20 |
+
| `sessions.py` | Tracks and resets session-specific in-memory context |
|
| 21 |
+
| `utils.py` | Helpers used by multiple routers (e.g. session ID management) |
|
| 22 |
+
|
| 23 |
+
---
|
| 24 |
+
|
| 25 |
+
## `auth.py` – User Authentication API
|
| 26 |
+
|
| 27 |
+
| Endpoint | Method | Description |
|
| 28 |
+
|----------|--------|-------------|
|
| 29 |
+
| `/signup` | `POST` | Register a new user |
|
| 30 |
+
| `/login` | `POST` | Authenticate user and return access token |
|
| 31 |
+
| `/me` | `GET` | Return current logged-in user |
|
| 32 |
+
| `/healthcheck` | `GET` | Ping endpoint to check login status |
|
| 33 |
+
|
| 34 |
+
Uses JWT-based Bearer token auth via FastAPI dependencies.
|
| 35 |
+
|
| 36 |
+
---
|
| 37 |
+
|
| 38 |
+
## `chat.py` – Chat Interaction
|
| 39 |
+
|
| 40 |
+
| Endpoint | Method | Description |
|
| 41 |
+
|----------|--------|-------------|
|
| 42 |
+
| `/chat-sequential` | `POST` | Run a full advisor loop and return all persona responses |
|
| 43 |
+
| `/reply-to-advisor` | `POST` | Ask a question to a specific advisor/persona |
|
| 44 |
+
|
| 45 |
+
These routes handle:
|
| 46 |
+
- Message routing via `ImprovedChatOrchestrator`
|
| 47 |
+
- Persona-wise response generation
|
| 48 |
+
- Embedding document-aware context
|
| 49 |
+
- Returning consistent message structure
|
| 50 |
+
|
| 51 |
+
---
|
| 52 |
+
|
| 53 |
+
## `chat_sessions.py` – Persistent Storage of Conversations
|
| 54 |
+
|
| 55 |
+
| Endpoint | Method | Description |
|
| 56 |
+
|----------|--------|-------------|
|
| 57 |
+
| `/chat-sessions` | `GET` | List all saved chat sessions |
|
| 58 |
+
| `/chat-sessions/{id}` | `GET` | Retrieve specific chat session |
|
| 59 |
+
| `/chat-sessions/{id}` | `DELETE` | Soft-delete a chat session |
|
| 60 |
+
| `/chat-sessions/save` | `POST` | Save in-memory session to MongoDB |
|
| 61 |
+
|
| 62 |
+
Saves message history, metadata, and uploaded files.
|
| 63 |
+
|
| 64 |
+
---
|
| 65 |
+
|
| 66 |
+
## `debug.py` – Developer Tools
|
| 67 |
+
|
| 68 |
+
| Endpoint | Method | Description |
|
| 69 |
+
|----------|--------|-------------|
|
| 70 |
+
| `/debug/personas` | `GET` | List current personas, prompts, keywords |
|
| 71 |
+
| `/debug/ranked-personas` | `GET` | Return top advisors for current session |
|
| 72 |
+
| `/debug/rag-status` | `GET` | Run sample RAG query + return health info |
|
| 73 |
+
|
| 74 |
+
Provides insight into:
|
| 75 |
+
- Persona prompt preview
|
| 76 |
+
- RAG test queries and indexed documents
|
| 77 |
+
- Session size + truncation status
|
| 78 |
+
|
| 79 |
+
---
|
| 80 |
+
|
| 81 |
+
## `documents.py` – Document Upload and RAG
|
| 82 |
+
|
| 83 |
+
| Endpoint | Method | Description |
|
| 84 |
+
|----------|--------|-------------|
|
| 85 |
+
| `/upload-document` | `POST` | Upload and parse a document for semantic search |
|
| 86 |
+
| `/search-documents` | `POST` | RAG search using text query and persona context |
|
| 87 |
+
| `/document-stats` | `GET` | Overview of documents uploaded to session |
|
| 88 |
+
| `/uploaded-files` | `GET` | Return list of uploaded file names |
|
| 89 |
+
| `/document-insights/{filename}` | `GET` | Get detailed metadata for a document |
|
| 90 |
+
| `/export-chat` | `GET` | Export current or stored chat session (PDF, TXT, DOCX) |
|
| 91 |
+
| `/chat-summary` | `GET` | Export summary generated by LLM (multi-format) |
|
| 92 |
+
|
| 93 |
+
Supports file parsing (`PDF`, `DOCX`, `TXT`), chunking, embedding, and export.
|
| 94 |
+
|
| 95 |
+
---
|
| 96 |
+
|
| 97 |
+
## `provider.py` – LLM Provider Control
|
| 98 |
+
|
| 99 |
+
| Endpoint | Method | Description |
|
| 100 |
+
|----------|--------|-------------|
|
| 101 |
+
| `/current-provider` | `GET` | Return currently active provider and model |
|
| 102 |
+
| `/switch-provider` | `POST` | Dynamically switch between `gemini` and `ollama` |
|
| 103 |
+
| `/current-model` | `GET` | Get currently loaded model name |
|
| 104 |
+
| `/switch-model` | `POST` | Alias for switching based on model name |
|
| 105 |
+
|
| 106 |
+
Changes are propagated by:
|
| 107 |
+
- Creating new LLM client
|
| 108 |
+
- Re-registering all personas
|
| 109 |
+
|
| 110 |
+
---
|
| 111 |
+
|
| 112 |
+
## `sessions.py` – In-Memory Session Management
|
| 113 |
+
|
| 114 |
+
| Endpoint | Method | Description |
|
| 115 |
+
|----------|--------|-------------|
|
| 116 |
+
| `/context` | `GET` | Return current session context (messages, documents, stats) |
|
| 117 |
+
| `/reset-session` | `POST` | Reset in-memory session or specific chat context |
|
| 118 |
+
| `/session-stats` | `GET` | Return stats like message count, file size, timestamps |
|
| 119 |
+
| `/active-sessions` | `GET` | Return list of all active in-memory sessions |
|
| 120 |
+
| `/cleanup-sessions` | `POST` | Manually trigger expired session cleanup |
|
| 121 |
+
|
| 122 |
+
Supports ephemeral sessions and reusable chat contexts (e.g. for documents).
|
| 123 |
+
|
| 124 |
+
---
|
| 125 |
+
|
| 126 |
+
## `utils.py` – Route-Level Utilities
|
| 127 |
+
|
| 128 |
+
Defines shared helper:
|
| 129 |
+
|
| 130 |
+
- `get_or_create_session_for_request(request)`
|
| 131 |
+
- `get_or_create_session_for_request_async(request)`
|
| 132 |
+
|
| 133 |
+
These parse session cookies or generate new session IDs, crucial for maintaining separation across:
|
| 134 |
+
- In-memory ephemeral sessions
|
| 135 |
+
- Document-linked long-term sessions
|
| 136 |
+
|
| 137 |
+
---
|
| 138 |
+
|
| 139 |
+
## `root.py` – API Healthcheck
|
| 140 |
+
|
| 141 |
+
| Endpoint | Method | Description |
|
| 142 |
+
|----------|--------|-------------|
|
| 143 |
+
| `/` | `GET` | Return version + feature list |
|
| 144 |
+
|
| 145 |
+
Simple heartbeat endpoint used for readiness probes and sanity checks.
|
| 146 |
+
|
| 147 |
+
---
|
| 148 |
+
|
| 149 |
+
## Auth Flow Integration
|
| 150 |
+
|
| 151 |
+
Most routes use:
|
| 152 |
+
|
| 153 |
+
```python
|
| 154 |
+
Depends(get_current_active_user)
|
| 155 |
+
```
|
| 156 |
+
|
| 157 |
+
This ensures only logged-in users can:
|
| 158 |
+
- Upload and retrieve files
|
| 159 |
+
- Export summaries
|
| 160 |
+
- Save or delete chat sessions
|
| 161 |
+
|
| 162 |
+
JWT tokens are passed via the `Authorization: Bearer ...` header.
|
| 163 |
+
|
| 164 |
+
---
|
| 165 |
+
|
| 166 |
+
## High-Level Flow
|
| 167 |
+
|
| 168 |
+
```text
|
| 169 |
+
Frontend → /chat-sequential → orchestrator → personas → RAG + LLM → response[]
|
| 170 |
+
↘ /upload-document → extractor → RAG chunks → indexed
|
| 171 |
+
↘ /context or /reset-session → session_manager
|
| 172 |
+
↘ /export-chat or /chat-summary → utils + formatter
|
| 173 |
+
```
|
| 174 |
+
|
| 175 |
+
---
|
multi_llm_chatbot_backend/app/core/README.md
ADDED
|
@@ -0,0 +1,200 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# `app/core` – Application Core Logic
|
| 2 |
+
|
| 3 |
+
This is the **central brain** of the multi-LLM chatbot system. It orchestrates user interaction, persona logic, context management, document-based retrieval (RAG), session tracking, authentication, and initialization.
|
| 4 |
+
|
| 5 |
+
---
|
| 6 |
+
|
| 7 |
+
## Overview of Modules
|
| 8 |
+
|
| 9 |
+
| Module | Responsibility |
|
| 10 |
+
|--------|----------------|
|
| 11 |
+
| `auth.py` | Authentication (JWT, password hashing, user resolution) |
|
| 12 |
+
| `bootstrap.py` | System startup logic: loads LLMs, personas, orchestrators |
|
| 13 |
+
| `context.py` | Global per-session context (simplified storage) |
|
| 14 |
+
| `context_manager.py` | Core context formatting & windowing for Gemini/Ollama |
|
| 15 |
+
| `database.py` | MongoDB connection & index management |
|
| 16 |
+
| `improved_orchestrator.py` | Main message routing engine: document-aware, multi-persona orchestrator |
|
| 17 |
+
| `rag_manager.py` | RAG with ChromaDB: chunking, storage, semantic search |
|
| 18 |
+
| `session_manager.py` | Full chat lifecycle tracker (in-memory) with RAG hooks |
|
| 19 |
+
|
| 20 |
+
---
|
| 21 |
+
|
| 22 |
+
## `auth.py` – Authentication System
|
| 23 |
+
|
| 24 |
+
Handles secure authentication via:
|
| 25 |
+
- Bcrypt hashing (`passlib`)
|
| 26 |
+
- JWT creation and validation (`python-jose`)
|
| 27 |
+
- Secure route access using FastAPI’s `Depends`
|
| 28 |
+
|
| 29 |
+
### Functions
|
| 30 |
+
|
| 31 |
+
- `get_password_hash(password)` – Hash password using bcrypt
|
| 32 |
+
- `verify_password(plain, hashed)` – Verify password
|
| 33 |
+
- `create_access_token(data)` – Return JWT (30-day expiry default)
|
| 34 |
+
- `get_current_user()` – Decodes token and returns `User` model
|
| 35 |
+
- `authenticate_user(email, password)` – Checks login credentials
|
| 36 |
+
- `create_user_response(user)` – Returns `UserResponse` for frontend
|
| 37 |
+
|
| 38 |
+
---
|
| 39 |
+
|
| 40 |
+
## `bootstrap.py` – System Bootstrap
|
| 41 |
+
|
| 42 |
+
Runs once on app startup to:
|
| 43 |
+
- Determine the default LLM provider (Gemini or Ollama)
|
| 44 |
+
- Initialize `ImprovedChatOrchestrator`
|
| 45 |
+
- Inject personas using `get_default_personas(llm)`
|
| 46 |
+
|
| 47 |
+
```python
|
| 48 |
+
llm = create_llm_client() # Gemini or Ollama
|
| 49 |
+
chat_orchestrator = ImprovedChatOrchestrator()
|
| 50 |
+
DEFAULT_PERSONAS = get_default_personas(llm)
|
| 51 |
+
```
|
| 52 |
+
|
| 53 |
+
Each persona is **registered** into the orchestrator using `.register_persona()`.
|
| 54 |
+
|
| 55 |
+
---
|
| 56 |
+
|
| 57 |
+
## `context.py` – Global Per-Session Context
|
| 58 |
+
|
| 59 |
+
A basic context storage class (`GlobalSessionContext`) that keeps:
|
| 60 |
+
- `full_log`: List of all messages
|
| 61 |
+
- `uploaded_files`: Tracked files per session
|
| 62 |
+
- `total_upload_size`: Helps enforce limits
|
| 63 |
+
|
| 64 |
+
Used primarily in earlier versions or smaller contexts.
|
| 65 |
+
|
| 66 |
+
---
|
| 67 |
+
|
| 68 |
+
## `context_manager.py` – LLM Context Window Formatter
|
| 69 |
+
|
| 70 |
+
This class builds optimized context windows for both Gemini and Ollama:
|
| 71 |
+
|
| 72 |
+
### `ContextManager.prepare_context_for_llm()`
|
| 73 |
+
Returns a `ContextWindow(messages, token_count, truncated)` with:
|
| 74 |
+
- LLM-specific formatting
|
| 75 |
+
- Automatic message pruning based on token limits
|
| 76 |
+
- Recency- and relevance-weighted scoring for old messages
|
| 77 |
+
- Automatic stop tokens, system prompts, and formatting
|
| 78 |
+
|
| 79 |
+
### Key Features
|
| 80 |
+
|
| 81 |
+
| Feature | Gemini | Ollama |
|
| 82 |
+
|--------|--------|--------|
|
| 83 |
+
| Format | JSON roles + parts | Flat prompt string |
|
| 84 |
+
| Role Mapping | 'user', 'model' | 'User:', 'Assistant:' |
|
| 85 |
+
| Chunking Strategy | Full doc as `Context Document:` | Plain text injection |
|
| 86 |
+
| Stop Sequences | Customizable | Enforced via `stop[]` |
|
| 87 |
+
|
| 88 |
+
Used **by all LLM clients** (Ollama/Gemini) and the **orchestrator**.
|
| 89 |
+
|
| 90 |
+
---
|
| 91 |
+
|
| 92 |
+
## `database.py` – MongoDB Connector
|
| 93 |
+
|
| 94 |
+
- Uses `motor` for async MongoDB
|
| 95 |
+
- Exposes `get_database()` to other modules
|
| 96 |
+
- Automatically creates indexes on `users` and `chat_sessions`
|
| 97 |
+
- Controlled via `.env` (`MONGODB_CONNECTION_STRING`)
|
| 98 |
+
|
| 99 |
+
```python
|
| 100 |
+
await connect_to_mongo()
|
| 101 |
+
await close_mongo_connection()
|
| 102 |
+
```
|
| 103 |
+
|
| 104 |
+
---
|
| 105 |
+
|
| 106 |
+
## `improved_orchestrator.py` – Brain of the Chatbot
|
| 107 |
+
|
| 108 |
+
This is the main **message routing engine**.
|
| 109 |
+
|
| 110 |
+
### Main Responsibilities
|
| 111 |
+
|
| 112 |
+
- Route user input through:
|
| 113 |
+
- Clarification detection
|
| 114 |
+
- Document-aware context building
|
| 115 |
+
- Persona-level response generation
|
| 116 |
+
- Aggregate responses from **multiple advisors**
|
| 117 |
+
- Embed document-based context (RAG)
|
| 118 |
+
|
| 119 |
+
### Key Functions
|
| 120 |
+
|
| 121 |
+
- `process_message()` – Entry point for chat flow (calls all advisors)
|
| 122 |
+
- `chat_with_persona()` – Talk to one specific advisor
|
| 123 |
+
- `_generate_persona_responses()` – Routes through each registered persona
|
| 124 |
+
- `_build_enhanced_context_for_persona()` – Combines conversation + document info
|
| 125 |
+
|
| 126 |
+
### Extras
|
| 127 |
+
- Document parsing hints (`"my thesis"`, `"section 2"`, etc.)
|
| 128 |
+
- Top-K persona ranking (`get_top_personas()`)
|
| 129 |
+
- Persona-specific fallback logic
|
| 130 |
+
- Session reset/deletion
|
| 131 |
+
|
| 132 |
+
Used by `/chat-sequential`, `/reply-to-advisor`, etc.
|
| 133 |
+
|
| 134 |
+
---
|
| 135 |
+
|
| 136 |
+
## `rag_manager.py` – RAG System for Docs
|
| 137 |
+
|
| 138 |
+
Supports **vector-based retrieval** using:
|
| 139 |
+
|
| 140 |
+
- Sentence Transformers (`all-MiniLM-L6-v2`)
|
| 141 |
+
- ChromaDB (`PersistentClient` with metadata)
|
| 142 |
+
- Metadata-aware enhanced chunking
|
| 143 |
+
- Overlapping token window strategy
|
| 144 |
+
- Section-wise classification
|
| 145 |
+
|
| 146 |
+
### Core Components
|
| 147 |
+
|
| 148 |
+
| Class | Role |
|
| 149 |
+
|-------|------|
|
| 150 |
+
| `RAGManager` | Standard chunking, basic RAG |
|
| 151 |
+
| `EnhancedRAGManager` | Persona-aware + metadata-annotated vector chunks |
|
| 152 |
+
|
| 153 |
+
### `EnhancedRAGManager` supports:
|
| 154 |
+
- Section tagging (`methodology`, `theory`, etc.)
|
| 155 |
+
- Multi-level filters (`session_id`, `filename`)
|
| 156 |
+
- Attribution fields (`chunk_position`, `has_methodology`)
|
| 157 |
+
- Relevance scoring and ranking
|
| 158 |
+
|
| 159 |
+
Used by orchestrator when generating document-aware responses.
|
| 160 |
+
|
| 161 |
+
---
|
| 162 |
+
|
| 163 |
+
## `session_manager.py` – Chat Lifecycle Controller
|
| 164 |
+
|
| 165 |
+
Handles:
|
| 166 |
+
- In-memory session creation + cleanup (with expiration)
|
| 167 |
+
- Tracks uploaded files and size
|
| 168 |
+
- Holds message logs for each session
|
| 169 |
+
- Links to RAG via `add_uploaded_file()` and `get_rag_stats()`
|
| 170 |
+
|
| 171 |
+
### `ConversationContext`
|
| 172 |
+
|
| 173 |
+
| Attribute | Description |
|
| 174 |
+
|-----------|-------------|
|
| 175 |
+
| `messages` | List of role-message entries |
|
| 176 |
+
| `uploaded_files` | Filenames (content stored in RAG DB) |
|
| 177 |
+
| `document_chunks_count` | Count of indexed doc chunks |
|
| 178 |
+
| `last_retrieval_stats` | From last RAG search |
|
| 179 |
+
| `created_at`, `last_accessed` | Session activity tracking |
|
| 180 |
+
|
| 181 |
+
Includes:
|
| 182 |
+
- Reset functions (`clear_all_data()`)
|
| 183 |
+
- File-level message logging (`append_message()`)
|
| 184 |
+
|
| 185 |
+
### `SessionManager`
|
| 186 |
+
|
| 187 |
+
- Thread-safe via locks
|
| 188 |
+
- Handles cleanup of expired sessions (`_cleanup_expired_sessions()`)
|
| 189 |
+
- Returns statistics via `get_session_stats()`
|
| 190 |
+
|
| 191 |
+
---
|
| 192 |
+
|
| 193 |
+
## Interactions Summary
|
| 194 |
+
|
| 195 |
+
```text
|
| 196 |
+
User Input → Orchestrator
|
| 197 |
+
↳ SessionManager → Context
|
| 198 |
+
↳ RAGManager → Relevant Docs
|
| 199 |
+
↳ LLMClient (Gemini/Ollama) ← ContextManager
|
| 200 |
+
```
|
multi_llm_chatbot_backend/app/llm/README.md
ADDED
|
@@ -0,0 +1,173 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# `app/llm` – LLM Integration Layer
|
| 2 |
+
|
| 3 |
+
This module abstracts and implements communication with **local** and **cloud-based** large language models (LLMs) via interchangeable client wrappers.
|
| 4 |
+
|
| 5 |
+
It defines:
|
| 6 |
+
- A common interface for all LLM clients (`LLMClient`)
|
| 7 |
+
- A wrapper for Google Gemini API (`ImprovedGeminiClient`)
|
| 8 |
+
- A wrapper for Ollama local models (`ImprovedOllamaClient`)
|
| 9 |
+
- A sentence transformer embedding model (`embedding_client.py`)
|
| 10 |
+
|
| 11 |
+
---
|
| 12 |
+
|
| 13 |
+
## Abstract Base – `llm_client.py`
|
| 14 |
+
|
| 15 |
+
This file defines the **contract** that all LLM clients must follow.
|
| 16 |
+
|
| 17 |
+
### `class LLMClient (ABC)`
|
| 18 |
+
|
| 19 |
+
An abstract base class using Python’s `abc` module.
|
| 20 |
+
|
| 21 |
+
```python
|
| 22 |
+
@abstractmethod
|
| 23 |
+
async def generate(system_prompt: str, context: List[dict], temperature: float, max_tokens: int) -> str
|
| 24 |
+
```
|
| 25 |
+
|
| 26 |
+
Every model wrapper must implement this coroutine to generate a response given:
|
| 27 |
+
- A system prompt (persona instructions)
|
| 28 |
+
- A user/system message context (list of `{role, content}` dicts)
|
| 29 |
+
- A temperature (float 0.0–1.0, typically scaled from 0–10)
|
| 30 |
+
- A token limit (integer)
|
| 31 |
+
|
| 32 |
+
---
|
| 33 |
+
|
| 34 |
+
## Gemini Client – `improved_gemini_client.py`
|
| 35 |
+
|
| 36 |
+
### Overview
|
| 37 |
+
|
| 38 |
+
- Communicates with **Google’s Gemini API** via `httpx`
|
| 39 |
+
- Dynamically injects the `system_prompt` into the context using `context_manager`
|
| 40 |
+
- Uses environment variables for API key and model name (`GEMINI_API_KEY`, `GEMINI_MODEL`)
|
| 41 |
+
|
| 42 |
+
### Key Features
|
| 43 |
+
|
| 44 |
+
| Feature | Description |
|
| 45 |
+
|--------|-------------|
|
| 46 |
+
| Context Prep | Uses `context_manager.prepare_context_for_llm()` to optimize message length |
|
| 47 |
+
| Endpoint | `https://generativelanguage.googleapis.com/v1beta/models/{model_name}:generateContent` |
|
| 48 |
+
| Content Format | Gemini expects JSON-formatted `contents`, not string prompts |
|
| 49 |
+
| Safety Settings | Blocks harmful or explicit content categories |
|
| 50 |
+
| Fallback Logic | Returns user-friendly error messages on bad or empty responses |
|
| 51 |
+
| Token Limit | `maxOutputTokens` passed explicitly |
|
| 52 |
+
|
| 53 |
+
### SafetyConfig JSON Example
|
| 54 |
+
|
| 55 |
+
```json
|
| 56 |
+
"safetySettings": [
|
| 57 |
+
{"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
|
| 58 |
+
{"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_MEDIUM_AND_ABOVE"}
|
| 59 |
+
]
|
| 60 |
+
```
|
| 61 |
+
|
| 62 |
+
### Differences from Ollama
|
| 63 |
+
- Requires an API key and runs over HTTPS
|
| 64 |
+
- Parses deeply nested JSON structures (candidates → content → parts)
|
| 65 |
+
- Strict token and safety controls
|
| 66 |
+
- More structured response format
|
| 67 |
+
|
| 68 |
+
---
|
| 69 |
+
|
| 70 |
+
## Ollama Client – `improved_ollama_client.py`
|
| 71 |
+
|
| 72 |
+
### Overview
|
| 73 |
+
|
| 74 |
+
- Interfaces with a **local Ollama model server** (`http://localhost:11434`)
|
| 75 |
+
- Sends prompts as raw formatted strings (not JSON "messages")
|
| 76 |
+
- Uses `context_manager` to prepare prompt text
|
| 77 |
+
|
| 78 |
+
### Key Features
|
| 79 |
+
|
| 80 |
+
| Feature | Description |
|
| 81 |
+
|--------|-------------|
|
| 82 |
+
| Endpoint | `/api/generate` |
|
| 83 |
+
| Payload | Flat prompt string + generation config |
|
| 84 |
+
| Cleansing | Strips verbose, inconsistent prefixes or filler |
|
| 85 |
+
| Quality Filter | Removes overly verbose or vague responses |
|
| 86 |
+
| Robust | Recovers from connection and timeout failures |
|
| 87 |
+
|
| 88 |
+
### Prompt Payload Example
|
| 89 |
+
|
| 90 |
+
```json
|
| 91 |
+
{
|
| 92 |
+
"model": "llama3.2:1b",
|
| 93 |
+
"prompt": "System: You are a helpful advisor...\nUser: What is...",
|
| 94 |
+
"stream": false,
|
| 95 |
+
"options": {
|
| 96 |
+
"temperature": 0.4,
|
| 97 |
+
"top_p": 0.9,
|
| 98 |
+
"top_k": 40,
|
| 99 |
+
"num_predict": 300,
|
| 100 |
+
"repeat_penalty": 1.1,
|
| 101 |
+
"stop": ["Student:", "User:", "Question:"]
|
| 102 |
+
}
|
| 103 |
+
}
|
| 104 |
+
```
|
| 105 |
+
|
| 106 |
+
### Differences from Gemini
|
| 107 |
+
|
| 108 |
+
| Area | Gemini | Ollama |
|
| 109 |
+
|------|--------|--------|
|
| 110 |
+
| Hosting | Cloud API | Local server |
|
| 111 |
+
| Format | JSON "messages" | Raw string prompt |
|
| 112 |
+
| Safety Filters | Yes | No |
|
| 113 |
+
| Token Control | `maxOutputTokens` | `num_predict` |
|
| 114 |
+
| Output | Structured parts | Single `response` string |
|
| 115 |
+
| Response Cleaning | Minimal | Aggressively stripped of fluff |
|
| 116 |
+
| Performance | High-quality, slower | Fast & offline |
|
| 117 |
+
|
| 118 |
+
---
|
| 119 |
+
|
| 120 |
+
## Embedding Model – `embedding_client.py`
|
| 121 |
+
|
| 122 |
+
### Purpose
|
| 123 |
+
|
| 124 |
+
Provides embedding vectors (used for semantic similarity and document retrieval) using `sentence-transformers`.
|
| 125 |
+
|
| 126 |
+
### Uses:
|
| 127 |
+
- Model: `all-MiniLM-L6-v2` (lightweight + performant)
|
| 128 |
+
- Library: `sentence-transformers`
|
| 129 |
+
- Function: `get_embedding(text: str) -> List[float]`
|
| 130 |
+
|
| 131 |
+
```python
|
| 132 |
+
embedding = get_embedding("example sentence")
|
| 133 |
+
```
|
| 134 |
+
|
| 135 |
+
### Notes
|
| 136 |
+
- This module does **not** use Gemini embeddings (for cost and simplicity)
|
| 137 |
+
- Can be upgraded later to use Gemini’s `embedding` endpoint or Ollama-based models with vector support
|
| 138 |
+
|
| 139 |
+
---
|
| 140 |
+
|
| 141 |
+
## Environment Variables
|
| 142 |
+
|
| 143 |
+
| Variable | Description | Example |
|
| 144 |
+
|----------|-------------|---------|
|
| 145 |
+
| `GEMINI_API_KEY` | API key for Google Gemini | `AIzz123...` |
|
| 146 |
+
| `GEMINI_MODEL` | Default Gemini model name | `gemini-2.0-flash` |
|
| 147 |
+
| `OLLAMA_BASE_URL` | Local server base URL | `http://localhost:11434` |
|
| 148 |
+
|
| 149 |
+
---
|
| 150 |
+
|
| 151 |
+
## Context Management Integration
|
| 152 |
+
|
| 153 |
+
Both clients use:
|
| 154 |
+
|
| 155 |
+
```python
|
| 156 |
+
context_window = context_manager.prepare_context_for_llm(...)
|
| 157 |
+
```
|
| 158 |
+
|
| 159 |
+
This ensures that:
|
| 160 |
+
- Prompt fits within model limits
|
| 161 |
+
- Truncation metadata is logged/debugged
|
| 162 |
+
- Messages are pre-formatted or optimized per provider
|
| 163 |
+
|
| 164 |
+
---
|
| 165 |
+
|
| 166 |
+
## Error Handling
|
| 167 |
+
|
| 168 |
+
All clients log internal issues and fallback to graceful responses. Each client handles:
|
| 169 |
+
- Timeouts (`httpx.TimeoutException`)
|
| 170 |
+
- API errors (`httpx.HTTPStatusError`, bad payloads)
|
| 171 |
+
- Unexpected failures (fallback strings are returned)
|
| 172 |
+
|
| 173 |
+
---
|
multi_llm_chatbot_backend/app/models/README.md
ADDED
|
@@ -0,0 +1,140 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# `app/models` – Data Models & Persona Configuration
|
| 2 |
+
|
| 3 |
+
This module defines the **core data structures** for users, chat sessions, and AI advisor personas in the Multi-LLM Chatbot Backend.
|
| 4 |
+
|
| 5 |
+
It plays a foundational role in ensuring that:
|
| 6 |
+
- User data and session state are **structured, validated, and serializable**
|
| 7 |
+
- Persona behavior is **configurable, injectable, and extensible**
|
| 8 |
+
|
| 9 |
+
---
|
| 10 |
+
|
| 11 |
+
## Persona Model (`persona.py`)
|
| 12 |
+
|
| 13 |
+
### `class Persona`
|
| 14 |
+
|
| 15 |
+
Represents a single AI advisor with its own personality, tone, and domain of expertise.
|
| 16 |
+
|
| 17 |
+
| Attribute | Description |
|
| 18 |
+
|----------------|-------------|
|
| 19 |
+
| `id` | Unique identifier for the persona |
|
| 20 |
+
| `name` | Human-readable display name |
|
| 21 |
+
| `system_prompt`| The persona’s default LLM instruction |
|
| 22 |
+
| `llm` | Instance of the LLM client (Gemini/Ollama) |
|
| 23 |
+
| `temperature` | Controls creativity level (0–10 scale, converted to 0.0–1.0 internally) |
|
| 24 |
+
|
| 25 |
+
### `respond()` method
|
| 26 |
+
|
| 27 |
+
This asynchronous method generates a persona-specific reply using the provided context and desired `response_length` (short, medium, long). It uses a **system prompt + user messages** + length-based instructions.
|
| 28 |
+
|
| 29 |
+
```python
|
| 30 |
+
await persona.respond(context=messages, response_length="medium")
|
| 31 |
+
```
|
| 32 |
+
|
| 33 |
+
---
|
| 34 |
+
|
| 35 |
+
## Persona Registry (`default_personas.py`)
|
| 36 |
+
|
| 37 |
+
Defines and registers **all built-in personas** using detailed `system_prompt` templates and metadata.
|
| 38 |
+
|
| 39 |
+
> These prompts define the tone, response style, formatting rules, document behavior, and epistemological approach of each advisor.
|
| 40 |
+
|
| 41 |
+
### Available Personas
|
| 42 |
+
|
| 43 |
+
- `methodologist`: Research methods and design expert
|
| 44 |
+
- `theorist`: Theoretical frameworks and philosophy of science
|
| 45 |
+
- `pragmatist`: Action-oriented coach with a focus on task execution
|
| 46 |
+
- `socratic`: Socratic questioning mentor
|
| 47 |
+
- `motivator`: Psychology-focused coach to build momentum
|
| 48 |
+
- `critic`: Constructive reviewer with sharp academic critique
|
| 49 |
+
- `storyteller`: Communication and storytelling specialist
|
| 50 |
+
- `minimalist`: Minimal guidance, maximum clarity
|
| 51 |
+
- `visionary`: Long-term strategy and innovation
|
| 52 |
+
- `empathetic`: Emotionally aware advisor for mental health & motivation
|
| 53 |
+
|
| 54 |
+
### Registry Functions
|
| 55 |
+
|
| 56 |
+
| Function | Description |
|
| 57 |
+
|---------|-------------|
|
| 58 |
+
| `get_default_personas(llm)` | Returns a list of `Persona` instances with LLM injected |
|
| 59 |
+
| `get_default_persona_prompt(pid)` | Returns only the `system_prompt` of a persona |
|
| 60 |
+
| `is_valid_persona_id(pid)` | Checks if ID exists in registry |
|
| 61 |
+
| `list_available_personas()` | Lists all persona IDs |
|
| 62 |
+
|
| 63 |
+
---
|
| 64 |
+
|
| 65 |
+
## User & Session Models (`user.py`)
|
| 66 |
+
|
| 67 |
+
### `UserCreate` / `UserLogin`
|
| 68 |
+
|
| 69 |
+
Pydantic models for request payloads during signup/login.
|
| 70 |
+
|
| 71 |
+
### `User`
|
| 72 |
+
|
| 73 |
+
Persistent user object, mapped to MongoDB using `_id` aliasing.
|
| 74 |
+
|
| 75 |
+
| Field | Description |
|
| 76 |
+
|-------|-------------|
|
| 77 |
+
| `id` (`_id`) | MongoDB ObjectId |
|
| 78 |
+
| `email`, `hashed_password` | Auth fields |
|
| 79 |
+
| `academicStage`, `researchArea` | Optional metadata |
|
| 80 |
+
| `created_at`, `last_login` | Timestamps |
|
| 81 |
+
| `is_active` | Soft-deletion or block flag |
|
| 82 |
+
|
| 83 |
+
### `UserResponse`
|
| 84 |
+
|
| 85 |
+
Serialized user profile returned to frontend after login/token validation.
|
| 86 |
+
|
| 87 |
+
---
|
| 88 |
+
|
| 89 |
+
### `ChatSession`
|
| 90 |
+
|
| 91 |
+
Stores a **single multi-turn conversation**. Used for RAG context, memory, and export.
|
| 92 |
+
|
| 93 |
+
| Field | Description |
|
| 94 |
+
|-------|-------------|
|
| 95 |
+
| `id` | MongoDB `_id` |
|
| 96 |
+
| `user_id` | Owner user’s ID |
|
| 97 |
+
| `title` | Human-readable title |
|
| 98 |
+
| `messages` | List of exchanged messages |
|
| 99 |
+
| `created_at`, `updated_at` | Session lifecycle tracking |
|
| 100 |
+
| `is_active` | Whether it is a deleted/inactive session |
|
| 101 |
+
|
| 102 |
+
### `ChatSessionResponse`
|
| 103 |
+
|
| 104 |
+
Returned when listing past sessions (lightweight response).
|
| 105 |
+
|
| 106 |
+
---
|
| 107 |
+
|
| 108 |
+
### `Token`
|
| 109 |
+
|
| 110 |
+
Used as the unified login response structure:
|
| 111 |
+
|
| 112 |
+
```json
|
| 113 |
+
{
|
| 114 |
+
"access_token": "...",
|
| 115 |
+
"token_type": "bearer",
|
| 116 |
+
"user": { ... }
|
| 117 |
+
}
|
| 118 |
+
```
|
| 119 |
+
|
| 120 |
+
---
|
| 121 |
+
|
| 122 |
+
## Design Principles
|
| 123 |
+
|
| 124 |
+
- All models are **fully compatible with FastAPI + Pydantic**
|
| 125 |
+
- MongoDB integration uses `bson.ObjectId` support and aliases
|
| 126 |
+
- Persona logic is **decoupled** from orchestration — easy to extend
|
| 127 |
+
- System prompts are rich, structured, and **frontend-format aware** (markdown rules enforced)
|
| 128 |
+
|
| 129 |
+
---
|
| 130 |
+
|
| 131 |
+
## Next Steps
|
| 132 |
+
|
| 133 |
+
This module is used by:
|
| 134 |
+
|
| 135 |
+
- `core/improved_orchestrator.py` – Persona routing
|
| 136 |
+
- `routes/chat.py` – Sequential chat + replies
|
| 137 |
+
- `auth.py` – Token generation and validation
|
| 138 |
+
- `documents.py` – Document-enhanced message generation
|
| 139 |
+
|
| 140 |
+
> Add a new persona? Just extend `DEFAULT_PERSONAS` and restart the backend.
|
multi_llm_chatbot_backend/app/utils/README.md
ADDED
|
@@ -0,0 +1,142 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# `app/utils` – Utility Modules for Summarization, Export, and Embeddings
|
| 2 |
+
|
| 3 |
+
This directory includes reusable tools that support the backend application with:
|
| 4 |
+
|
| 5 |
+
- Chat summarization for display/export
|
| 6 |
+
- Document extraction and cleanup
|
| 7 |
+
- File export to TXT, DOCX, and PDF formats
|
| 8 |
+
- File upload validation
|
| 9 |
+
- Persona-specific vector DB with ChromaDB
|
| 10 |
+
|
| 11 |
+
These modules are loosely coupled and used across core routes, RAG logic, and export endpoints.
|
| 12 |
+
|
| 13 |
+
---
|
| 14 |
+
|
| 15 |
+
## `chat_summary.py` – Conversation Summarization
|
| 16 |
+
|
| 17 |
+
This module provides summarization of past conversations using the LLM client.
|
| 18 |
+
|
| 19 |
+
### Key Functions
|
| 20 |
+
|
| 21 |
+
- `generate_summary_from_messages(messages, llm, max_tokens)` – Generates a formatted, bullet-style summary
|
| 22 |
+
- `format_summary_for_text_export(summary_text)` – Cleans summary for export to PDF/DOCX/TXT
|
| 23 |
+
- `parse_summary_to_blocks(summary_text)` – Converts summary to structured blocks (headings, lists, paragraphs)
|
| 24 |
+
|
| 25 |
+
### Format Guidelines
|
| 26 |
+
|
| 27 |
+
Summaries follow a markdown-style format with:
|
| 28 |
+
- `**Section Name:**` for headings
|
| 29 |
+
- `* Bullet Points` for insights and recommendations
|
| 30 |
+
- Auto-trimming and line breaks for export formatting
|
| 31 |
+
|
| 32 |
+
---
|
| 33 |
+
|
| 34 |
+
## `chroma_client.py` – Persona-Specific Knowledge Store
|
| 35 |
+
|
| 36 |
+
A minimal ChromaDB wrapper used to store and query persona-specific documents or embeddings.
|
| 37 |
+
|
| 38 |
+
### Functions
|
| 39 |
+
|
| 40 |
+
- `add_persona_doc(text, persona, doc_id)` – Add a new chunk/document for a persona
|
| 41 |
+
- `query_persona_knowledge(query, persona)` – Query ChromaDB for a persona-specific response
|
| 42 |
+
|
| 43 |
+
### Notes
|
| 44 |
+
|
| 45 |
+
- Uses `./chroma_storage` as the default persistent path
|
| 46 |
+
- Uses the local embedding model via `get_embedding()` from `embedding_client.py`
|
| 47 |
+
|
| 48 |
+
---
|
| 49 |
+
|
| 50 |
+
## `document_extractor.py` – File Text Extraction
|
| 51 |
+
|
| 52 |
+
Supports extracting raw text from uploaded documents.
|
| 53 |
+
|
| 54 |
+
### Supported Formats
|
| 55 |
+
|
| 56 |
+
| Format | Content Type |
|
| 57 |
+
|--------|---------------|
|
| 58 |
+
| PDF | `application/pdf` |
|
| 59 |
+
| DOCX | `application/vnd.openxmlformats-officedocument.wordprocessingml.document` |
|
| 60 |
+
| TXT | `text/plain` |
|
| 61 |
+
|
| 62 |
+
### Key Function
|
| 63 |
+
|
| 64 |
+
```python
|
| 65 |
+
extract_text_from_file(file_bytes: bytes, content_type: str) -> str
|
| 66 |
+
```
|
| 67 |
+
|
| 68 |
+
Uses:
|
| 69 |
+
- `PyPDF2` for PDFs
|
| 70 |
+
- `docx2txt` for Word documents (via temp file)
|
| 71 |
+
- UTF-8 decoding for plain text
|
| 72 |
+
|
| 73 |
+
---
|
| 74 |
+
|
| 75 |
+
## `file_export.py` – Export Chat & Summaries
|
| 76 |
+
|
| 77 |
+
Exports content (chat logs or summaries) to the following formats:
|
| 78 |
+
- `.txt`
|
| 79 |
+
- `.docx` (Word)
|
| 80 |
+
- `.pdf` (ReportLab)
|
| 81 |
+
|
| 82 |
+
### Key Functions
|
| 83 |
+
|
| 84 |
+
- `export_chat_as_file(content, format)` – Unified export method (calls generate_*)
|
| 85 |
+
- `prepare_export_response()` – Returns a `StreamingResponse` with correct content-disposition
|
| 86 |
+
|
| 87 |
+
### Formatting Functions
|
| 88 |
+
|
| 89 |
+
- `generate_txt_file()` – Simple UTF-8 stream
|
| 90 |
+
- `generate_docx_file()` – Paragraph-based Word file using `python-docx`
|
| 91 |
+
- `generate_pdf_file()` – Uses ReportLab’s Platypus for chat-style layout
|
| 92 |
+
- `generate_pdf_file_from_blocks()` – Used for structured summaries (heading, lists, etc.)
|
| 93 |
+
|
| 94 |
+
All formats apply automatic cleanup and styling via:
|
| 95 |
+
- `_clean_text_for_pdf()` and `_render_rich_text()`
|
| 96 |
+
|
| 97 |
+
---
|
| 98 |
+
|
| 99 |
+
## `file_limits.py` – Upload Size Checks
|
| 100 |
+
|
| 101 |
+
Used to prevent users from uploading excessively large files in a session.
|
| 102 |
+
|
| 103 |
+
### Configurable Limit
|
| 104 |
+
|
| 105 |
+
```python
|
| 106 |
+
MAX_TOTAL_UPLOAD_MB = 10
|
| 107 |
+
```
|
| 108 |
+
|
| 109 |
+
### Function
|
| 110 |
+
|
| 111 |
+
- `is_within_upload_limit(session_id, new_file_bytes, session_context)` – Returns `True` if upload is within session cap
|
| 112 |
+
|
| 113 |
+
Used by routes handling document uploads.
|
| 114 |
+
|
| 115 |
+
---
|
| 116 |
+
|
| 117 |
+
## Dependencies
|
| 118 |
+
|
| 119 |
+
These modules are used in:
|
| 120 |
+
|
| 121 |
+
| Module | Depends On |
|
| 122 |
+
|--------|------------|
|
| 123 |
+
| `rag_manager.py` | `document_extractor`, `file_limits` |
|
| 124 |
+
| `chat_summary.py` | `llm_client` |
|
| 125 |
+
| `routes/documents.py` | `document_extractor`, `file_limits` |
|
| 126 |
+
| `routes/export.py` | `file_export`, `chat_summary` |
|
| 127 |
+
|
| 128 |
+
---
|
| 129 |
+
|
| 130 |
+
## Example Workflow
|
| 131 |
+
|
| 132 |
+
```text
|
| 133 |
+
Upload File → document_extractor.py → raw text
|
| 134 |
+
↓
|
| 135 |
+
file_limits.py → check quota
|
| 136 |
+
|
| 137 |
+
Chat History → chat_summary.py → formatted summary
|
| 138 |
+
↓
|
| 139 |
+
file_export.py → TXT, DOCX, PDF
|
| 140 |
+
|
| 141 |
+
Persona Notes → chroma_client.py → embedded in ChromaDB
|
| 142 |
+
```
|