| # 🧭 User Guide: How the App Works | |
| This application is more than just a simple "Chat with PDF" tool. It is a professional **Knowledge Management System** designed to bridge the gap between your raw data (files) and actionable intelligence (answers). | |
| To use it effectively, you need to understand the three core concepts: **Knowledge Bases**, **Assistants**, and **Chats**. | |
| --- | |
| ## 1. The Core Concepts (The Hierarchy) | |
| Think of this system like a real-world library, but supercharged with AI: | |
| 1. **📚 Knowledge Base (The Library Shelf):** | |
| * **Role:** The Storage Layer. | |
| * This is where you store your documents. It handles the "heavy lifting" of converting text into mathematical vectors. | |
| * It is *passive*. It doesn't "think"; it just remembers. | |
| * *Example:* "Legal Documents", "Personal Journals", "Python Textbooks". | |
| 2. **🤖 Assistant (The Librarian):** | |
| * **Role:** The Logic Layer. | |
| * This is the "Brain". It has a personality, instructions, and a memory of your conversation. | |
| * Crucially, **an Assistant must be assigned to a Knowledge Base**. Without a KB, the Assistant is just a generic chatbot (like standard ChatGPT) with no access to your private data. | |
| * *Example:* "Legal Advisor" (linked to Legal Docs), "Coding Tutor" (linked to Python Textbooks). | |
| 3. **💬 Chat (The Conversation):** | |
| * **Role:** The Interaction Layer. | |
| * This is a specific session between YOU and an ASSISTANT. | |
| * You can have multiple chats with the same assistant (e.g., "Session 1: Contract Review", "Session 2: Lease Agreement"). The Assistant remembers the context *within* a chat, but not *across* chats. | |
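The hierarchy above can be sketched as three small data types. This is only an illustration: the class and field names (`KnowledgeBase`, `Assistant`, `Chat`, `history`, `add_turn`) are hypothetical and are not taken from the app's code.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class KnowledgeBase:
    """Storage layer: holds documents and has no behavior of its own."""
    name: str
    documents: List[str] = field(default_factory=list)


@dataclass
class Assistant:
    """Logic layer: instructions plus a link to one knowledge base."""
    name: str
    system_prompt: str
    knowledge_base: KnowledgeBase


@dataclass
class Chat:
    """Interaction layer: one session with one assistant, with its own history."""
    title: str
    assistant: Assistant
    history: List[Tuple[str, str]] = field(default_factory=list)

    def add_turn(self, role: str, text: str) -> None:
        # Context accumulates per chat; nothing is written to other chats.
        self.history.append((role, text))


legal_docs = KnowledgeBase("Legal Documents")
advisor = Assistant("Legal Advisor", "You are a senior legal analyst.", legal_docs)

# Two chats share one assistant but keep separate histories.
session_1 = Chat("Session 1: Contract Review", advisor)
session_2 = Chat("Session 2: Lease Agreement", advisor)
session_1.add_turn("user", "Summarize clause 4.")
```

Because each `Chat` owns its `history`, the turn added to `session_1` never appears in `session_2`, which mirrors the "within a chat, not across chats" rule.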
| --- | |
| ## 2. The Workflow: Step-by-Step | |
| Follow this flow to build your first RAG (Retrieval-Augmented Generation) application. | |
| ### Step 1: Create a Knowledge Base | |
| * **Action:** Go to the "Knowledge Bases" tab and click "Create New". | |
| * **Why?** You need a container for your data. | |
| * **Behind the Scenes:** The system contacts the **Qdrant Vector Database** and creates a new "Collection". Think of this as creating a new, empty bucket specifically designed to hold high-dimensional vectors (numbers). | |
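To make the "empty bucket" idea concrete, here is a minimal in-memory stand-in for a vector collection. It is not the Qdrant client API; `InMemoryVectorStore`, `create_collection`, `add_point`, and the vector size of 4 are hypothetical names and values chosen for illustration.

```python
class InMemoryVectorStore:
    """Toy stand-in for a vector database (an assumption, not Qdrant's API)."""

    def __init__(self):
        self._collections = {}

    def create_collection(self, name, vector_size):
        # A collection starts empty and fixes the dimension of its vectors.
        if name in self._collections:
            raise ValueError(f"collection {name!r} already exists")
        self._collections[name] = {"vector_size": vector_size, "points": []}

    def add_point(self, name, vector, payload):
        collection = self._collections[name]
        # Every vector in a collection must have the same dimension.
        if len(vector) != collection["vector_size"]:
            raise ValueError("vector has the wrong dimension for this collection")
        collection["points"].append((list(vector), payload))

    def count(self, name):
        return len(self._collections[name]["points"])


store = InMemoryVectorStore()
store.create_collection("legal_documents", vector_size=4)
empty_count = store.count("legal_documents")  # nothing stored until Step 2
```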
| ### Step 2: Ingest Documents | |
| * **Action:** Open your new Knowledge Base and upload files (PDFs, Text, etc.). | |
| * **Why?** The system needs to read, understand, and index your content. | |
| * **What happens (The "Magic"):** | |
| 1. **Parsing:** The file is cleaned and converted to text (see `PARSING_STRATEGIES.md`). | |
| 2. **Chunking:** The text is intelligently split into small, meaningful pieces (see `CHUNKING_STRATEGIES.md`). | |
| 3. **Embedding:** Each piece is run through an AI model to create a "vector" (see `EMBEDDING_MODELS.md`). | |
| * *Note:* This process is CPU-intensive. A large PDF might take a minute to process because every chunk has to be passed through the embedding model. | |
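The three ingestion stages can be sketched end to end as follows. This is a toy version under stated assumptions: the real strategies live in the referenced `.md` files, the chunker here is a plain fixed-size word window with overlap, and `embed` is a hashed bag-of-words vector rather than an AI model. All function names and parameters are hypothetical.

```python
import hashlib
import math
import re
from typing import List


def parse(raw: str) -> str:
    """Parsing: collapse whitespace so the text is clean."""
    return re.sub(r"\s+", " ", raw).strip()


def chunk(text: str, size: int = 8, overlap: int = 2) -> List[str]:
    """Chunking: fixed-size word windows that overlap by `overlap` words."""
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str, dim: int = 16) -> List[float]:
    """Embedding (toy): hash each word into a bucket, then L2-normalize."""
    vec = [0.0] * dim
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


raw = "Refunds  are issued\nwithin 30 days of purchase. " * 3
pieces = chunk(parse(raw))
vectors = [embed(piece) for piece in pieces]  # one vector per chunk
```

The last line is where the cost comes from: the embedding step runs once per chunk, so processing time grows with document length.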
| ### Step 3: Create an Assistant | |
| * **Action:** Go to the "Assistants" tab and click "Create New". | |
| * **Why?** You need an interface to talk to your data. | |
| * **Crucial Settings:** | |
| * **Knowledge Base:** You MUST link the KB you created in Step 1. | |
| * **System Prompt:** This is where you program the AI's behavior. | |
| * *Bad Prompt:* "You are a bot." | |
| * *Good Prompt:* "You are a senior legal analyst. Answer strictly based on the provided documents. If the answer is not in the text, say 'I don't know'." | |
| * **Temperature:** Controls creativity. | |
| * `0.0` (Strict): Good for factual Q&A. | |
| * `0.7` (Creative): Good for brainstorming or writing. | |
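To see why temperature behaves this way, here is a sketch of temperature-scaled softmax, the standard way a language model's raw scores (logits) are turned into next-token probabilities. The logits are made-up numbers, and treating `0.0` as "always pick the top token" is an assumption about how a backend handles that value, not something confirmed for this app.

```python
import math
from typing import List


def token_probabilities(logits: List[float], temperature: float) -> List[float]:
    """Convert raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0.0:
        # Assumed behavior for 0.0: greedy choice of the highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
strict = token_probabilities(logits, 0.0)
creative = token_probabilities(logits, 0.7)
```

With `strict`, all probability sits on the top token, so answers are repeatable. With `creative`, lower-scoring tokens keep some probability, which is what makes the output more varied.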
| ### Step 4: Start a Chat | |
| * **Action:** Go to the "Chats" tab, select the assistant you created in Step 3, and start a new chat. | |
| * **Why?** To begin the Q&A loop. | |
| * **The RAG Loop (What actually happens):** | |
| 1. **Query:** You ask, "What is the refund policy?" | |
| 2. **Retrieval:** Before generating anything, the system searches your Knowledge Base for the top 5 chunks related to "refunds" (using the strategies in `RETRIEVAL_STRATEGIES.md`). | |
| 3. **Context Injection:** It secretly pastes those 5 chunks into the prompt, right before your question. | |
| 4. **Generation:** The LLM reads the chunks and generates an answer grounded in that information. How strictly it stays within the chunks depends on the System Prompt you wrote in Step 3. | |
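The retrieval and context-injection steps of the loop above can be sketched like this. It is a simplified illustration: similarity is computed over word counts instead of a real embedding model, the final LLM call is left out, and the names `retrieve`, `build_prompt`, and the prompt layout are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding: a bag of lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: List[str], top_k: int = 5) -> List[str]:
    """Retrieval: rank stored chunks by similarity to the query, keep the top k."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(system_prompt: str, context_chunks: List[str], question: str) -> str:
    """Context injection: place the retrieved chunks right before the question."""
    context = "\n\n".join(context_chunks)
    return f"{system_prompt}\n\nContext:\n{context}\n\nQuestion: {question}"


chunks = [
    "Our refund policy allows returns within 30 days.",
    "Shipping takes 3 to 5 business days.",
    "Offices close on public holidays.",
]
question = "What is the refund policy?"
top_chunks = retrieve(question, chunks, top_k=2)
prompt = build_prompt("Answer strictly based on the provided documents.", top_chunks, question)
```

The resulting `prompt` string is what would be sent to the LLM in the generation step.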
| --- | |
| ## 3. Why this Architecture? | |
| Why separate "Assistants" from "Knowledge Bases"? Why not just upload a file and chat? | |
| **1. The Power of Reusability (One-to-Many)** | |
| Imagine you have a huge Knowledge Base called **"Company Financial Reports"** (100+ PDFs). Indexing this takes time and storage. You can create **multiple assistants** that use this *same* data for different purposes: | |
| * **Assistant A ("The Auditor"):** Strict, low temperature, looks for errors. | |
| * **Assistant B ("The Summarizer"):** Creative, high temperature, writes blog posts. | |
| * **Assistant C ("The Historian"):** Focuses on year-over-year trends. | |
| All three use the same expensive vector index. | |
| **2. Data Safety** | |
| If you delete an Assistant (e.g., "The Auditor"), **you do not delete the data**. Your Knowledge Base remains intact. This prevents accidental data loss and allows you to experiment with different Assistant personalities without re-uploading files. | |
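Both points, reuse and data safety, follow from assistants holding only a reference to a Knowledge Base. The sketch below assumes a hypothetical `Workspace` registry; the method names and the temperature values are illustrative and are not the app's actual API.

```python
class Workspace:
    """Toy registry: assistants refer to knowledge bases by name only."""

    def __init__(self):
        self.knowledge_bases = {}  # kb name -> list of indexed chunks
        self.assistants = {}       # assistant name -> settings dict

    def create_knowledge_base(self, name, chunks):
        self.knowledge_bases[name] = list(chunks)

    def create_assistant(self, name, kb_name, temperature):
        if kb_name not in self.knowledge_bases:
            raise KeyError(f"unknown knowledge base: {kb_name!r}")
        # Only the KB name is stored; the indexed data is not copied.
        self.assistants[name] = {"kb_name": kb_name, "temperature": temperature}

    def delete_assistant(self, name):
        # Removes the assistant's settings and leaves every KB untouched.
        del self.assistants[name]


ws = Workspace()
ws.create_knowledge_base("Company Financial Reports", ["Q1 report", "Q2 report"])
ws.create_assistant("The Auditor", "Company Financial Reports", temperature=0.0)
ws.create_assistant("The Summarizer", "Company Financial Reports", temperature=0.7)

ws.delete_assistant("The Auditor")
```

After the deletion, the Knowledge Base and "The Summarizer" are unaffected, so a new assistant can be attached to the same index without re-uploading files.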