Cybersecurity-Panel

Sleeping

App Files Files Community

Sohan Kshirsagar commited on Aug 6, 2025

Commit

9fabeb7

1 Parent(s): 4f0dfc7

Backend Documentation Addition

Browse files

Files changed (6) hide show

multi_llm_chatbot_backend/README.md +178 -0
multi_llm_chatbot_backend/app/api/README.md +175 -0
multi_llm_chatbot_backend/app/core/README.md +200 -0
multi_llm_chatbot_backend/app/llm/README.md +173 -0
multi_llm_chatbot_backend/app/models/README.md +140 -0
multi_llm_chatbot_backend/app/utils/README.md +142 -0

multi_llm_chatbot_backend/README.md ADDED Viewed

	@@ -0,0 +1,178 @@

+# Multi-LLM Chatbot Backend
+A modular, extensible FastAPI backend for building an AI-powered research advisor chatbot that supports:
+- Multiple AI personas with configurable tone and behavior
+- Dynamic switching between Gemini (cloud) and Ollama (local) LLMs
+- Chat session persistence and context memory
+- Document upload, chunking, and retrieval using RAG
+- Rich export features (PDF, DOCX, TXT)
+- User authentication and JWT-based access control
+---
+## Backend Architecture
+```text
+User Input
+   ↓
+/chat-sequential → Orchestrator
+     ↓            ↙         ↘
+  SessionManager   ContextManager   RAGManager
+         ↓              ↓             ↓
+     MongoDB        Token Trimming   ChromaDB
+         ↓              ↓             ↓
+        Persisted Chat & Doc Context → LLM (Gemini/Ollama)
+```
+---
+## Features
+- Persona-based multi-agent conversation (`Theorist`, `Pragmatist`, etc.)
+- Provider switching (Gemini ↔ Ollama)
+- Context-aware response routing + top-K advisor selection
+- PDF, DOCX, and TXT file upload and semantic retrieval
+- Developer tools: debug personas, test RAG, export sessions
+- Secure authentication and session scoping
+---
+## Setup Instructions
+### 1. Clone and Configure Environment
+```bash
+git clone https://github.com/yourorg/multi-llm-chatbot-backend
+cd multi-llm-chatbot-backend
+cp .env.example .env  # already provided
+```
+### 2. Python Environment Setup
+```bash
+python -m venv venv
+source venv/bin/activate  # or venv\Scripts\activate on Windows
+pip install -r requirements.txt
+```
+### 3. Run the Server
+```bash
+uvicorn app.main:app --reload
+```
+> Server will be available at: `http://localhost:8000`
+---
+## FastAPI Routing & Modules
+| Folder | Description |
+|--------|-------------|
+| [`app/api`](./api_README.md) | REST API endpoints for chat, auth, RAG, exports |
+| [`app/core`](./core_README.md) | Main orchestration, context windows, database logic |
+| [`app/llm`](./llm_README.md) | Gemini + Ollama LLM wrappers |
+| [`app/models`](./models_README.md) | Persona and user schemas |
+| [`app/utils`](./utils_README.md) | File parsing, summaries, exports, vector helpers |
+---
+## Key Files
+### `main.py`
+- Loads env vars, sets up FastAPI instance with CORS and routers
+- Calls `connect_to_mongo()` on startup and `close_mongo_connection()` on shutdown
+- Imports and registers all routers (`auth`, `chat_sessions`, etc.)
+### `.env` (Sample Vars)
+```ini
+# MongoDB
+MONGODB_CONNECTION_STRING=mongodb://localhost:27017
+MONGODB_DATABASE_NAME=neon_ai_backend
+# Gemini API Key and model
+GEMINI_API_KEY=...  # Replace with real key
+GEMINI_MODEL=gemini-2.0-flash
+# Default provider
+DEFAULT_PROVIDER=gemini
+```
+### `requirements.txt`
+Includes:
+- **FastAPI**, **Uvicorn**: API framework and server
+- **httpx**: Async LLM request handler
+- **motor**, **pymongo**: MongoDB async access
+- **chromadb**, **sentence-transformers**: Vector database + embeddings
+- **PyPDF2**, **docx2txt**, **reportlab**: Document parsing and PDF generation
+- **passlib**, **python-jose**: Auth and security
+---
+## Persona Design & Context Handling
+- Personas defined in `app/models/default_personas.py`
+- Rich system prompts, styles, and epistemologies
+- Responses routed through `ImprovedChatOrchestrator`
+- Context trimmed and weighted via `ContextManager`
+---
+## Switching LLM Providers
+You can hot-swap models via API:
+```http
+POST /switch-provider
+{ "provider": "gemini" } | { "provider": "ollama" }
+```
+> Also supported: `/switch-model`, `/current-model`, `/current-provider`
+---
+## Document Upload + RAG
+- Upload PDFs, DOCX, or TXT to sessions
+- Text is extracted → chunked → embedded → stored in ChromaDB
+- Queried during conversation by persona-aware `EnhancedRAGManager`
+---
+## Export Options
+| Format | Export Endpoint |
+|--------|------------------|
+| PDF | `/export-chat?format=pdf` |
+| DOCX | `/export-chat?format=docx` |
+| TXT | `/export-chat?format=txt` |
+| Summary | `/chat-summary?format=pdf` |
+---
+## Developer & Debug Endpoints
+| Endpoint | Purpose |
+|----------|---------|
+| `/debug/personas` | See registered advisors and prompts |
+| `/debug/ranked-personas` | View top-K advisors for context |
+| `/debug/rag-status` | Run sample search to test document index |
+---
+## Status & Roadmap
+- [x] Multi-LLM backend ready (Gemini + Ollama)
+- [x] Document RAG + export system
+- [x] Session-aware persona routing
+- [x] JWT Auth + MongoDB user handling
+- [ ] UI enhancements and persona memory
+- [ ] Persona fine-tuning support (future)
+---
+For questions, contributions, or deployment help — feel free to reach out!

multi_llm_chatbot_backend/app/api/README.md ADDED Viewed

	@@ -0,0 +1,175 @@

+# `app/api` – REST API Layer for Multi-LLM Chatbot
+This module defines the complete FastAPI-based HTTP interface for all backend features, including chat, session management, RAG operations, provider switching, and document interaction.
+Each file in this directory defines route groups (`APIRouter`) to modularize functionality.
+---
+## API Directory Layout
+| File | Purpose |
+|------|---------|
+| `auth.py` | Handles user authentication (login, signup, token validation) |
+| `chat.py` | Core routes for LLM-backed chat, reply-to-advisor, and multi-turn flow |
+| `chat_sessions.py` | Stores user conversations and provides access to saved history |
+| `debug.py` | Developer tools: debug personas, RAG tests, ranking advisor responses |
+| `documents.py` | Upload, parse, index, and query documents via RAG |
+| `provider.py` | Switch between Gemini and Ollama providers |
+| `root.py` | Root `/` endpoint for heartbeat and versioning |
+| `sessions.py` | Tracks and resets session-specific in-memory context |
+| `utils.py` | Helpers used by multiple routers (e.g. session ID management) |
+---
+## `auth.py` – User Authentication API
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/signup` | `POST` | Register a new user |
+| `/login` | `POST` | Authenticate user and return access token |
+| `/me` | `GET` | Return current logged-in user |
+| `/healthcheck` | `GET` | Ping endpoint to check login status |
+Uses JWT-based Bearer token auth via FastAPI dependencies.
+---
+## `chat.py` – Chat Interaction
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/chat-sequential` | `POST` | Run a full advisor loop and return all persona responses |
+| `/reply-to-advisor` | `POST` | Ask a question to a specific advisor/persona |
+These routes handle:
+- Message routing via `ImprovedChatOrchestrator`
+- Persona-wise response generation
+- Embedding document-aware context
+- Returning consistent message structure
+---
+## `chat_sessions.py` – Persistent Storage of Conversations
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/chat-sessions` | `GET` | List all saved chat sessions |
+| `/chat-sessions/{id}` | `GET` | Retrieve specific chat session |
+| `/chat-sessions/{id}` | `DELETE` | Soft-delete a chat session |
+| `/chat-sessions/save` | `POST` | Save in-memory session to MongoDB |
+Saves message history, metadata, and uploaded files.
+---
+## `debug.py` – Developer Tools
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/debug/personas` | `GET` | List current personas, prompts, keywords |
+| `/debug/ranked-personas` | `GET` | Return top advisors for current session |
+| `/debug/rag-status` | `GET` | Run sample RAG query + return health info |
+Provides insight into:
+- Persona prompt preview
+- RAG test queries and indexed documents
+- Session size + truncation status
+---
+## `documents.py` – Document Upload and RAG
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/upload-document` | `POST` | Upload and parse a document for semantic search |
+| `/search-documents` | `POST` | RAG search using text query and persona context |
+| `/document-stats` | `GET` | Overview of documents uploaded to session |
+| `/uploaded-files` | `GET` | Return list of uploaded file names |
+| `/document-insights/{filename}` | `GET` | Get detailed metadata for a document |
+| `/export-chat` | `GET` | Export current or stored chat session (PDF, TXT, DOCX) |
+| `/chat-summary` | `GET` | Export summary generated by LLM (multi-format) |
+Supports file parsing (`PDF`, `DOCX`, `TXT`), chunking, embedding, and export.
+---
+## `provider.py` – LLM Provider Control
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/current-provider` | `GET` | Return currently active provider and model |
+| `/switch-provider` | `POST` | Dynamically switch between `gemini` and `ollama` |
+| `/current-model` | `GET` | Get currently loaded model name |
+| `/switch-model` | `POST` | Alias for switching based on model name |
+Changes are propagated by:
+- Creating new LLM client
+- Re-registering all personas
+---
+## `sessions.py` – In-Memory Session Management
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/context` | `GET` | Return current session context (messages, documents, stats) |
+| `/reset-session` | `POST` | Reset in-memory session or specific chat context |
+| `/session-stats` | `GET` | Return stats like message count, file size, timestamps |
+| `/active-sessions` | `GET` | Return list of all active in-memory sessions |
+| `/cleanup-sessions` | `POST` | Manually trigger expired session cleanup |
+Supports ephemeral sessions and reusable chat contexts (e.g. for documents).
+---
+## `utils.py` – Route-Level Utilities
+Defines shared helper:
+- `get_or_create_session_for_request(request)`
+- `get_or_create_session_for_request_async(request)`
+These parse session cookies or generate new session IDs, crucial for maintaining separation across:
+- In-memory ephemeral sessions
+- Document-linked long-term sessions
+---
+## `root.py` – API Healthcheck
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/` | `GET` | Return version + feature list |
+Simple heartbeat endpoint used for readiness probes and sanity checks.
+---
+## Auth Flow Integration
+Most routes use:
+```python
+Depends(get_current_active_user)
+```
+This ensures only logged-in users can:
+- Upload and retrieve files
+- Export summaries
+- Save or delete chat sessions
+JWT tokens are passed via the `Authorization: Bearer ...` header.
+---
+## High-Level Flow
+```text
+Frontend → /chat-sequential → orchestrator → personas → RAG + LLM → response[]
+        ↘ /upload-document → extractor → RAG chunks → indexed
+        ↘ /context or /reset-session → session_manager
+        ↘ /export-chat or /chat-summary → utils + formatter
+```
+---

multi_llm_chatbot_backend/app/core/README.md ADDED Viewed

	@@ -0,0 +1,200 @@

+# `app/core` – Application Core Logic
+This is the **central brain** of the multi-LLM chatbot system. It orchestrates user interaction, persona logic, context management, document-based retrieval (RAG), session tracking, authentication, and initialization.
+---
+## Overview of Modules
+| Module | Responsibility |
+|--------|----------------|
+| `auth.py` | Authentication (JWT, password hashing, user resolution) |
+| `bootstrap.py` | System startup logic: loads LLMs, personas, orchestrators |
+| `context.py` | Global per-session context (simplified storage) |
+| `context_manager.py` | Core context formatting & windowing for Gemini/Ollama |
+| `database.py` | MongoDB connection & index management |
+| `improved_orchestrator.py` | Main message routing engine: document-aware, multi-persona orchestrator |
+| `rag_manager.py` | RAG with ChromaDB: chunking, storage, semantic search |
+| `session_manager.py` | Full chat lifecycle tracker (in-memory) with RAG hooks |
+---
+## `auth.py` – Authentication System
+Handles secure authentication via:
+- Bcrypt hashing (`passlib`)
+- JWT creation and validation (`python-jose`)
+- Secure route access using FastAPI’s `Depends`
+### Functions
+- `get_password_hash(password)` – Hash password using bcrypt
+- `verify_password(plain, hashed)` – Verify password
+- `create_access_token(data)` – Return JWT (30-day expiry default)
+- `get_current_user()` – Decodes token and returns `User` model
+- `authenticate_user(email, password)` – Checks login credentials
+- `create_user_response(user)` – Returns `UserResponse` for frontend
+---
+## `bootstrap.py` – System Bootstrap
+Runs once on app startup to:
+- Determine the default LLM provider (Gemini or Ollama)
+- Initialize `ImprovedChatOrchestrator`
+- Inject personas using `get_default_personas(llm)`
+```python
+llm = create_llm_client()  # Gemini or Ollama
+chat_orchestrator = ImprovedChatOrchestrator()
+DEFAULT_PERSONAS = get_default_personas(llm)
+```
+Each persona is **registered** into the orchestrator using `.register_persona()`.
+---
+## `context.py` – Global Per-Session Context
+A basic context storage class (`GlobalSessionContext`) that keeps:
+- `full_log`: List of all messages
+- `uploaded_files`: Tracked files per session
+- `total_upload_size`: Helps enforce limits
+Used primarily in earlier versions or smaller contexts.
+---
+## `context_manager.py` – LLM Context Window Formatter
+This class builds optimized context windows for both Gemini and Ollama:
+### `ContextManager.prepare_context_for_llm()`
+Returns a `ContextWindow(messages, token_count, truncated)` with:
+- LLM-specific formatting
+- Automatic message pruning based on token limits
+- Recency- and relevance-weighted scoring for old messages
+- Automatic stop tokens, system prompts, and formatting
+### Key Features
+| Feature | Gemini | Ollama |
+|--------|--------|--------|
+| Format | JSON roles + parts | Flat prompt string |
+| Role Mapping | 'user', 'model' | 'User:', 'Assistant:' |
+| Chunking Strategy | Full doc as `Context Document:` | Plain text injection |
+| Stop Sequences | Customizable | Enforced via `stop[]` |
+Used **by all LLM clients** (Ollama/Gemini) and the **orchestrator**.
+---
+## `database.py` – MongoDB Connector
+- Uses `motor` for async MongoDB
+- Exposes `get_database()` to other modules
+- Automatically creates indexes on `users` and `chat_sessions`
+- Controlled via `.env` (`MONGODB_CONNECTION_STRING`)
+```python
+await connect_to_mongo()
+await close_mongo_connection()
+```
+---
+## `improved_orchestrator.py` – Brain of the Chatbot
+This is the main **message routing engine**.
+### Main Responsibilities
+- Route user input through:
+  - Clarification detection
+  - Document-aware context building
+  - Persona-level response generation
+- Aggregate responses from **multiple advisors**
+- Embed document-based context (RAG)
+### Key Functions
+- `process_message()` – Entry point for chat flow (calls all advisors)
+- `chat_with_persona()` – Talk to one specific advisor
+- `_generate_persona_responses()` – Routes through each registered persona
+- `_build_enhanced_context_for_persona()` – Combines conversation + document info
+### Extras
+- Document parsing hints (`"my thesis"`, `"section 2"`, etc.)
+- Top-K persona ranking (`get_top_personas()`)
+- Persona-specific fallback logic
+- Session reset/deletion
+Used by `/chat-sequential`, `/reply-to-advisor`, etc.
+---
+## `rag_manager.py` – RAG System for Docs
+Supports **vector-based retrieval** using:
+- Sentence Transformers (`all-MiniLM-L6-v2`)
+- ChromaDB (`PersistentClient` with metadata)
+- Metadata-aware enhanced chunking
+- Overlapping token window strategy
+- Section-wise classification
+### Core Components
+| Class | Role |
+|-------|------|
+| `RAGManager` | Standard chunking, basic RAG |
+| `EnhancedRAGManager` | Persona-aware + metadata-annotated vector chunks |
+### `EnhancedRAGManager` supports:
+- Section tagging (`methodology`, `theory`, etc.)
+- Multi-level filters (`session_id`, `filename`)
+- Attribution fields (`chunk_position`, `has_methodology`)
+- Relevance scoring and ranking
+Used by orchestrator when generating document-aware responses.
+---
+## `session_manager.py` – Chat Lifecycle Controller
+Handles:
+- In-memory session creation + cleanup (with expiration)
+- Tracks uploaded files and size
+- Holds message logs for each session
+- Links to RAG via `add_uploaded_file()` and `get_rag_stats()`
+### `ConversationContext`
+| Attribute | Description |
+|-----------|-------------|
+| `messages` | List of role-message entries |
+| `uploaded_files` | Filenames (content stored in RAG DB) |
+| `document_chunks_count` | Count of indexed doc chunks |
+| `last_retrieval_stats` | From last RAG search |
+| `created_at`, `last_accessed` | Session activity tracking |
+Includes:
+- Reset functions (`clear_all_data()`)
+- File-level message logging (`append_message()`)
+### `SessionManager`
+- Thread-safe via locks
+- Handles cleanup of expired sessions (`_cleanup_expired_sessions()`)
+- Returns statistics via `get_session_stats()`
+---
+## Interactions Summary
+```text
+User Input → Orchestrator
+             ↳ SessionManager → Context
+             ↳ RAGManager → Relevant Docs
+             ↳ LLMClient (Gemini/Ollama) ← ContextManager
+```

multi_llm_chatbot_backend/app/llm/README.md ADDED Viewed

	@@ -0,0 +1,173 @@

+# `app/llm` – LLM Integration Layer
+This module abstracts and implements communication with **local** and **cloud-based** large language models (LLMs) via interchangeable client wrappers.
+It defines:
+- A common interface for all LLM clients (`LLMClient`)
+- A wrapper for Google Gemini API (`ImprovedGeminiClient`)
+- A wrapper for Ollama local models (`ImprovedOllamaClient`)
+- A sentence transformer embedding model (`embedding_client.py`)
+---
+## Abstract Base – `llm_client.py`
+This file defines the **contract** that all LLM clients must follow.
+### `class LLMClient (ABC)`
+An abstract base class using Python’s `abc` module.
+```python
+@abstractmethod
+async def generate(system_prompt: str, context: List[dict], temperature: float, max_tokens: int) -> str
+```
+Every model wrapper must implement this coroutine to generate a response given:
+- A system prompt (persona instructions)
+- A user/system message context (list of `{role, content}` dicts)
+- A temperature (float 0.0–1.0, typically scaled from 0–10)
+- A token limit (integer)
+---
+## Gemini Client – `improved_gemini_client.py`
+### Overview
+- Communicates with **Google’s Gemini API** via `httpx`
+- Dynamically injects the `system_prompt` into the context using `context_manager`
+- Uses environment variables for API key and model name (`GEMINI_API_KEY`, `GEMINI_MODEL`)
+### Key Features
+| Feature | Description |
+|--------|-------------|
+| Context Prep | Uses `context_manager.prepare_context_for_llm()` to optimize message length |
+| Endpoint | `https://generativelanguage.googleapis.com/v1beta/models/{model_name}:generateContent` |
+| Content Format | Gemini expects JSON-formatted `contents`, not string prompts |
+| Safety Settings | Blocks harmful or explicit content categories |
+| Fallback Logic | Returns user-friendly error messages on bad or empty responses |
+| Token Limit | `maxOutputTokens` passed explicitly |
+### SafetyConfig JSON Example
+```json
+"safetySettings": [
+  {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
+  {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_MEDIUM_AND_ABOVE"}
+]
+```
+### Differences from Ollama
+- Requires an API key and runs over HTTPS
+- Parses deeply nested JSON structures (candidates → content → parts)
+- Strict token and safety controls
+- More structured response format
+---
+## Ollama Client – `improved_ollama_client.py`
+### Overview
+- Interfaces with a **local Ollama model server** (`http://localhost:11434`)
+- Sends prompts as raw formatted strings (not JSON "messages")
+- Uses `context_manager` to prepare prompt text
+### Key Features
+| Feature | Description |
+|--------|-------------|
+| Endpoint | `/api/generate` |
+| Payload | Flat prompt string + generation config |
+| Cleansing | Strips verbose, inconsistent prefixes or filler |
+| Quality Filter | Removes overly verbose or vague responses |
+| Robust | Recovers from connection and timeout failures |
+### Prompt Payload Example
+```json
+{
+  "model": "llama3.2:1b",
+  "prompt": "System: You are a helpful advisor...\nUser: What is...",
+  "stream": false,
+  "options": {
+    "temperature": 0.4,
+    "top_p": 0.9,
+    "top_k": 40,
+    "num_predict": 300,
+    "repeat_penalty": 1.1,
+    "stop": ["Student:", "User:", "Question:"]
+  }
+}
+```
+### Differences from Gemini
+| Area | Gemini | Ollama |
+|------|--------|--------|
+| Hosting | Cloud API | Local server |
+| Format | JSON "messages" | Raw string prompt |
+| Safety Filters | Yes | No |
+| Token Control | `maxOutputTokens` | `num_predict` |
+| Output | Structured parts | Single `response` string |
+| Response Cleaning | Minimal | Aggressively stripped of fluff |
+| Performance | High-quality, slower | Fast & offline |
+---
+## Embedding Model – `embedding_client.py`
+### Purpose
+Provides embedding vectors (used for semantic similarity and document retrieval) using `sentence-transformers`.
+### Uses:
+- Model: `all-MiniLM-L6-v2` (lightweight + performant)
+- Library: `sentence-transformers`
+- Function: `get_embedding(text: str) -> List[float]`
+```python
+embedding = get_embedding("example sentence")
+```
+### Notes
+- This module does **not** use Gemini embeddings (for cost and simplicity)
+- Can be upgraded later to use Gemini’s `embedding` endpoint or Ollama-based models with vector support
+---
+## Environment Variables
+| Variable | Description | Example |
+|----------|-------------|---------|
+| `GEMINI_API_KEY` | API key for Google Gemini | `AIzz123...` |
+| `GEMINI_MODEL` | Default Gemini model name | `gemini-2.0-flash` |
+| `OLLAMA_BASE_URL` | Local server base URL | `http://localhost:11434` |
+---
+## Context Management Integration
+Both clients use:
+```python
+context_window = context_manager.prepare_context_for_llm(...)
+```
+This ensures that:
+- Prompt fits within model limits
+- Truncation metadata is logged/debugged
+- Messages are pre-formatted or optimized per provider
+---
+## Error Handling
+All clients log internal issues and fallback to graceful responses. Each client handles:
+- Timeouts (`httpx.TimeoutException`)
+- API errors (`httpx.HTTPStatusError`, bad payloads)
+- Unexpected failures (fallback strings are returned)
+---

multi_llm_chatbot_backend/app/models/README.md ADDED Viewed

	@@ -0,0 +1,140 @@

+# `app/models` – Data Models & Persona Configuration
+This module defines the **core data structures** for users, chat sessions, and AI advisor personas in the Multi-LLM Chatbot Backend.
+It plays a foundational role in ensuring that:
+- User data and session state are **structured, validated, and serializable**
+- Persona behavior is **configurable, injectable, and extensible**
+---
+## Persona Model (`persona.py`)
+### `class Persona`
+Represents a single AI advisor with its own personality, tone, and domain of expertise.
+| Attribute       | Description |
+|----------------|-------------|
+| `id`           | Unique identifier for the persona |
+| `name`         | Human-readable display name |
+| `system_prompt`| The persona’s default LLM instruction |
+| `llm`          | Instance of the LLM client (Gemini/Ollama) |
+| `temperature`  | Controls creativity level (0–10 scale, converted to 0.0–1.0 internally) |
+### `respond()` method
+This asynchronous method generates a persona-specific reply using the provided context and desired `response_length` (short, medium, long). It uses a **system prompt + user messages** + length-based instructions.
+```python
+await persona.respond(context=messages, response_length="medium")
+```
+---
+## Persona Registry (`default_personas.py`)
+Defines and registers **all built-in personas** using detailed `system_prompt` templates and metadata.
+> These prompts define the tone, response style, formatting rules, document behavior, and epistemological approach of each advisor.
+### Available Personas
+- `methodologist`: Research methods and design expert
+- `theorist`: Theoretical frameworks and philosophy of science
+- `pragmatist`: Action-oriented coach with a focus on task execution
+- `socratic`: Socratic questioning mentor
+- `motivator`: Psychology-focused coach to build momentum
+- `critic`: Constructive reviewer with sharp academic critique
+- `storyteller`: Communication and storytelling specialist
+- `minimalist`: Minimal guidance, maximum clarity
+- `visionary`: Long-term strategy and innovation
+- `empathetic`: Emotionally aware advisor for mental health & motivation
+### Registry Functions
+| Function | Description |
+|---------|-------------|
+| `get_default_personas(llm)` | Returns a list of `Persona` instances with LLM injected |
+| `get_default_persona_prompt(pid)` | Returns only the `system_prompt` of a persona |
+| `is_valid_persona_id(pid)` | Checks if ID exists in registry |
+| `list_available_personas()` | Lists all persona IDs |
+---
+## User & Session Models (`user.py`)
+### `UserCreate` / `UserLogin`
+Pydantic models for request payloads during signup/login.
+### `User`
+Persistent user object, mapped to MongoDB using `_id` aliasing.
+| Field | Description |
+|-------|-------------|
+| `id` (`_id`) | MongoDB ObjectId |
+| `email`, `hashed_password` | Auth fields |
+| `academicStage`, `researchArea` | Optional metadata |
+| `created_at`, `last_login` | Timestamps |
+| `is_active` | Soft-deletion or block flag |
+### `UserResponse`
+Serialized user profile returned to frontend after login/token validation.
+---
+### `ChatSession`
+Stores a **single multi-turn conversation**. Used for RAG context, memory, and export.
+| Field | Description |
+|-------|-------------|
+| `id` | MongoDB `_id` |
+| `user_id` | Owner user’s ID |
+| `title` | Human-readable title |
+| `messages` | List of exchanged messages |
+| `created_at`, `updated_at` | Session lifecycle tracking |
+| `is_active` | Whether it is a deleted/inactive session |
+### `ChatSessionResponse`
+Returned when listing past sessions (lightweight response).
+---
+### `Token`
+Used as the unified login response structure:
+```json
+{
+  "access_token": "...",
+  "token_type": "bearer",
+  "user": { ... }
+}
+```
+---
+## Design Principles
+- All models are **fully compatible with FastAPI + Pydantic**
+- MongoDB integration uses `bson.ObjectId` support and aliases
+- Persona logic is **decoupled** from orchestration — easy to extend
+- System prompts are rich, structured, and **frontend-format aware** (markdown rules enforced)
+---
+## Next Steps
+This module is used by:
+- `core/improved_orchestrator.py` – Persona routing
+- `routes/chat.py` – Sequential chat + replies
+- `auth.py` – Token generation and validation
+- `documents.py` – Document-enhanced message generation
+> Add a new persona? Just extend `DEFAULT_PERSONAS` and restart the backend.

multi_llm_chatbot_backend/app/utils/README.md ADDED Viewed

	@@ -0,0 +1,142 @@

+# `app/utils` – Utility Modules for Summarization, Export, and Embeddings
+This directory includes reusable tools that support the backend application with:
+- Chat summarization for display/export
+- Document extraction and cleanup
+- File export to TXT, DOCX, and PDF formats
+- File upload validation
+- Persona-specific vector DB with ChromaDB
+These modules are loosely coupled and used across core routes, RAG logic, and export endpoints.
+---
+## `chat_summary.py` – Conversation Summarization
+This module provides summarization of past conversations using the LLM client.
+### Key Functions
+- `generate_summary_from_messages(messages, llm, max_tokens)` – Generates a formatted, bullet-style summary
+- `format_summary_for_text_export(summary_text)` – Cleans summary for export to PDF/DOCX/TXT
+- `parse_summary_to_blocks(summary_text)` – Converts summary to structured blocks (headings, lists, paragraphs)
+### Format Guidelines
+Summaries follow a markdown-style format with:
+- `**Section Name:**` for headings
+- `* Bullet Points` for insights and recommendations
+- Auto-trimming and line breaks for export formatting
+---
+## `chroma_client.py` – Persona-Specific Knowledge Store
+A minimal ChromaDB wrapper used to store and query persona-specific documents or embeddings.
+### Functions
+- `add_persona_doc(text, persona, doc_id)` – Add a new chunk/document for a persona
+- `query_persona_knowledge(query, persona)` – Query ChromaDB for a persona-specific response
+### Notes
+- Uses `./chroma_storage` as the default persistent path
+- Uses the local embedding model via `get_embedding()` from `embedding_client.py`
+---
+## `document_extractor.py` – File Text Extraction
+Supports extracting raw text from uploaded documents.
+### Supported Formats
+| Format | Content Type |
+|--------|---------------|
+| PDF    | `application/pdf` |
+| DOCX   | `application/vnd.openxmlformats-officedocument.wordprocessingml.document` |
+| TXT    | `text/plain` |
+### Key Function
+```python
+extract_text_from_file(file_bytes: bytes, content_type: str) -> str
+```
+Uses:
+- `PyPDF2` for PDFs
+- `docx2txt` for Word documents (via temp file)
+- UTF-8 decoding for plain text
+---
+## `file_export.py` – Export Chat & Summaries
+Exports content (chat logs or summaries) to the following formats:
+- `.txt`
+- `.docx` (Word)
+- `.pdf` (ReportLab)
+### Key Functions
+- `export_chat_as_file(content, format)` – Unified export method (calls generate_*)
+- `prepare_export_response()` – Returns a `StreamingResponse` with correct content-disposition
+### Formatting Functions
+- `generate_txt_file()` – Simple UTF-8 stream
+- `generate_docx_file()` – Paragraph-based Word file using `python-docx`
+- `generate_pdf_file()` – Uses ReportLab’s Platypus for chat-style layout
+- `generate_pdf_file_from_blocks()` – Used for structured summaries (heading, lists, etc.)
+All formats apply automatic cleanup and styling via:
+- `_clean_text_for_pdf()` and `_render_rich_text()`
+---
+## `file_limits.py` – Upload Size Checks
+Used to prevent users from uploading excessively large files in a session.
+### Configurable Limit
+```python
+MAX_TOTAL_UPLOAD_MB = 10
+```
+### Function
+- `is_within_upload_limit(session_id, new_file_bytes, session_context)` – Returns `True` if upload is within session cap
+Used by routes handling document uploads.
+---
+## Dependencies
+These modules are used in:
+| Module | Depends On |
+|--------|------------|
+| `rag_manager.py` | `document_extractor`, `file_limits` |
+| `chat_summary.py` | `llm_client` |
+| `routes/documents.py` | `document_extractor`, `file_limits` |
+| `routes/export.py` | `file_export`, `chat_summary` |
+---
+## Example Workflow
+```text
+Upload File → document_extractor.py → raw text
+            ↓
+      file_limits.py → check quota
+Chat History → chat_summary.py → formatted summary
+                          ↓
+                  file_export.py → TXT, DOCX, PDF
+Persona Notes → chroma_client.py → embedded in ChromaDB
+```