NeonCharlie-24
Fix/dup first message (#44)
e19e41d unverified
|
Raw
History Blame Contribute Delete
4.61 kB
# Multi-LLM Chatbot Backend
A modular, extensible FastAPI backend for building an AI-powered research advisor chatbot that supports:
- Multiple AI personas with configurable tone and behavior
- Dynamic switching between Gemini (cloud) and Ollama (local) LLMs
- Chat session persistence and context memory
- Document upload, chunking, and retrieval using RAG
- Rich export features (PDF, DOCX, TXT)
- User authentication and JWT-based access control
---
## Backend Architecture
```text
User Input
/chat-stream → Orchestrator
↓ ↙ ↘
SessionManager ContextManager RAGManager
↓ ↓ ↓
MongoDB Token Trimming ChromaDB
↓ ↓ ↓
Persisted Chat & Doc Context → LLM (Gemini/Ollama)
```
---
## Features
- Persona-based multi-agent conversation (`Theorist`, `Pragmatist`, etc.)
- Provider switching (Gemini ↔ Ollama)
- Context-aware response routing + top-K advisor selection
- PDF, DOCX, and TXT file upload and semantic retrieval
- Developer tools: debug personas, test RAG, export sessions
- Secure authentication and session scoping
---
## Setup Instructions
### 1. Clone and Configure Environment
```bash
git clone https://github.com/yourorg/multi-llm-chatbot-backend
cd multi-llm-chatbot-backend
cp .env.example .env # already provided
```
### 2. Python Environment Setup
```bash
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
pip install -r requirements.txt
```
### 3. Run the Server
```bash
uvicorn app.main:app --reload
```
> Server will be available at: `http://localhost:8000`
---
## FastAPI Routing & Modules
| Folder | Description |
|--------|-------------|
| [`app/api`](./api_README.md) | REST API endpoints for chat, auth, RAG, exports |
| [`app/core`](./core_README.md) | Main orchestration, context windows, database logic |
| [`app/llm`](./llm_README.md) | Gemini + Ollama LLM wrappers |
| [`app/models`](./models_README.md) | Persona and user schemas |
| [`app/utils`](./utils_README.md) | File parsing, summaries, exports, vector helpers |
---
## Key Files
### `main.py`
- Loads env vars, sets up FastAPI instance with CORS and routers
- Calls `connect_to_mongo()` on startup and `close_mongo_connection()` on shutdown
- Imports and registers all routers (`auth`, `chat_sessions`, etc.)
### `.env` (Sample Vars)
```ini
# MongoDB
MONGODB_CONNECTION_STRING=mongodb://localhost:27017
MONGODB_DATABASE_NAME=neon_ai_backend
# Gemini API Key and model
GEMINI_API_KEY=... # Replace with real key
GEMINI_MODEL=gemini-2.0-flash
# Default provider
DEFAULT_PROVIDER=gemini
```
### `requirements.txt`
Includes:
- **FastAPI**, **Uvicorn**: API framework and server
- **httpx**: Async LLM request handler
- **motor**, **pymongo**: MongoDB async access
- **chromadb**, **sentence-transformers**: Vector database + embeddings
- **PyPDF2**, **docx2txt**, **reportlab**: Document parsing and PDF generation
- **passlib**, **python-jose**: Auth and security
---
## Persona Design & Context Handling
- Personas defined in `app/models/default_personas.py`
- Rich system prompts, styles, and epistemologies
- Responses routed through `ImprovedChatOrchestrator`
- Context trimmed and weighted via `ContextManager`
---
## Switching LLM Providers
You can hot-swap models via API:
```http
POST /switch-provider
{ "provider": "gemini" } | { "provider": "ollama" }
```
> Also supported: `/switch-model`, `/current-model`, `/current-provider`
---
## Document Upload + RAG
- Upload PDFs, DOCX, or TXT to sessions
- Text is extracted → chunked → embedded → stored in ChromaDB
- Queried during conversation by persona-aware `EnhancedRAGManager`
---
## Export Options
| Format | Export Endpoint |
|--------|------------------|
| PDF | `/export-chat?format=pdf` |
| DOCX | `/export-chat?format=docx` |
| TXT | `/export-chat?format=txt` |
| Summary | `/chat-summary?format=pdf` |
---
## Developer & Debug Endpoints
| Endpoint | Purpose |
|----------|---------|
| `/debug/personas` | See registered advisors and prompts |
| `/debug/ranked-personas` | View top-K advisors for context |
| `/debug/rag-status` | Run sample search to test document index |
---
## Status & Roadmap
- [x] Multi-LLM backend ready (Gemini + Ollama)
- [x] Document RAG + export system
- [x] Session-aware persona routing
- [x] JWT Auth + MongoDB user handling
- [ ] UI enhancements and persona memory
- [ ] Persona fine-tuning support (future)
---
For questions, contributions, or deployment help — feel free to reach out!