	# 🧭 User Guide: How the App Works

	This application is more than just a simple "Chat with PDF" tool. It is a professional Knowledge Management System designed to bridge the gap between your raw data (files) and actionable intelligence (answers).

	To use it effectively, you need to understand the three core concepts: Knowledge Bases, Assistants, and Chats.

	---

	## 1. The Core Concepts (The Hierarchy)

	Think of this system like a real-world library, but supercharged with AI:

	1. 📚 Knowledge Base (The Library Shelf):
	* Role: The Storage Layer.
	* This is where you store your documents. It handles the "heavy lifting" of converting text into mathematical vectors.
	* It is passive. It doesn't "think"; it just remembers.
	* Example: "Legal Documents", "Personal Journals", "Python Textbooks".

	2. 🤖 Assistant (The Librarian):
	* Role: The Logic Layer.
	* This is the "Brain". It has a personality, instructions, and a memory of your conversation.
	* Crucially, an Assistant must be assigned to a Knowledge Base. Without a KB, the Assistant is just a generic chatbot (like standard ChatGPT) with no access to your private data.
	* Example: "Legal Advisor" (linked to Legal Docs), "Coding Tutor" (linked to Python Textbooks).

	3. 💬 Chat (The Conversation):
	* Role: The Interaction Layer.
	* This is a specific session between YOU and an ASSISTANT.
	* You can have multiple chats with the same assistant (e.g., "Session 1: Contract Review", "Session 2: Lease Agreement"). The Assistant remembers the context within a chat, but not across chats.
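One way to picture the hierarchy is as three small data types. This is a minimal sketch, not the app's actual schema: the class and field names (`KnowledgeBase`, `Assistant`, `Chat`, `history`) are hypothetical and chosen only to mirror the three layers described above.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Storage layer: holds documents, has no behavior of its own."""
    name: str
    documents: list[str] = field(default_factory=list)


@dataclass
class Assistant:
    """Logic layer: instructions plus a link to one knowledge base."""
    name: str
    system_prompt: str
    knowledge_base: KnowledgeBase


@dataclass
class Chat:
    """Interaction layer: one session with one assistant."""
    title: str
    assistant: Assistant
    history: list[tuple[str, str]] = field(default_factory=list)

    def add_turn(self, role: str, text: str) -> None:
        # History lives on the chat, so it never leaks into other chats.
        self.history.append((role, text))


legal_docs = KnowledgeBase("Legal Documents", ["contract.pdf", "lease.pdf"])
advisor = Assistant("Legal Advisor", "You are a senior legal analyst.", legal_docs)

# Two chats share one assistant but keep separate histories.
contract_chat = Chat("Session 1: Contract Review", advisor)
lease_chat = Chat("Session 2: Lease Agreement", advisor)
contract_chat.add_turn("user", "Summarize the termination clause.")
```

Because each `Chat` owns its own `history`, a turn added to the contract session is invisible to the lease session, which matches the "within a chat, but not across chats" behavior.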

	---

	## 2. The Workflow: Step-by-Step

	Follow this flow to build your first RAG (Retrieval-Augmented Generation) application.

	### Step 1: Create a Knowledge Base
	* Action: Go to the "Knowledge Bases" tab and click "Create New".
	* Why? You need a container for your data.
	* Behind the Scenes: The system contacts the Qdrant Vector Database and creates a new "Collection". Think of this as creating a new, empty bucket specifically designed to hold high-dimensional vectors (numbers).
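As a rough illustration of what "creating a Collection" involves, the sketch below builds (but does not send) the HTTP request that Qdrant's REST API accepts for creating a collection. The guide does not show how the backend actually makes this call, so the base URL, the collection name, the 384-dimension vector size, and the cosine distance are all assumptions, and `build_create_collection_request` is a hypothetical helper.

```python
import json
from urllib.request import Request


def build_create_collection_request(base_url: str, name: str, vector_size: int) -> Request:
    """Build (but do not send) the HTTP request that creates a Qdrant collection."""
    # Every vector stored in the collection must have exactly `vector_size` numbers.
    body = {"vectors": {"size": vector_size, "distance": "Cosine"}}
    return Request(
        url=f"{base_url}/collections/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed values: a local Qdrant instance and a 384-dimensional embedding model.
request = build_create_collection_request("http://localhost:6333", "legal_documents", 384)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

The vector size is fixed when the collection is created, so it has to match the output size of whichever embedding model the Knowledge Base uses.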

	### Step 2: Ingest Documents
	* Action: Open your new Knowledge Base and upload files (PDFs, Text, etc.).
	* Why? The system needs to read, understand, and index your content.
	* What happens (The "Magic"):
	1. Parsing: The file is cleaned and converted to text (see `PARSING_STRATEGIES.md`).
	2. Chunking: The text is intelligently split into small, meaningful pieces (see `CHUNKING_STRATEGIES.md`).
	3. Embedding: Each piece is run through an AI model to create a "vector" (see `EMBEDDING_MODELS.md`).
	* Note: This process is CPU-intensive. A large PDF might take a minute to process because every chunk has to pass through the embedding model.
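The chunking and embedding steps can be sketched in a few lines. This is a simplified stand-in, not the app's pipeline: the fixed-size word window, the overlap value, and the hash-based `toy_embed` function are assumptions for illustration, and a real deployment would use the strategies and models described in the linked documents.

```python
import hashlib
import math


def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into word-based chunks that overlap so context spans boundaries."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text: str, dims: int = 16) -> list[float]:
    """Stand-in for a real embedding model: hash each word into a fixed-size vector."""
    vector = [0.0] * dims
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dims
        vector[bucket] += 1.0
    # Normalize to unit length so vectors can be compared with cosine similarity.
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0
    return [v / norm for v in vector]


document = " ".join(f"word{i}" for i in range(100))
chunks = chunk_text(document)
vectors = [toy_embed(chunk) for chunk in chunks]
print(len(chunks), "chunks,", len(vectors[0]), "dimensions each")
```

The embedding call runs once per chunk, which is why processing time grows with document length.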

	### Step 3: Create an Assistant
	* Action: Go to the "Assistants" tab and click "Create New".
	* Why? You need an interface to talk to your data.
	* Crucial Settings:
	    * Knowledge Base: You MUST link the KB you created in Step 1.
	    * System Prompt: This is where you program the AI's behavior.
	        * Bad Prompt: "You are a bot."
	        * Good Prompt: "You are a senior legal analyst. Answer strictly based on the provided documents. If the answer is not in the text, say 'I don't know'."
	    * Temperature: Controls creativity.
	        * `0.0` (Strict): Good for factual Q&A.
	        * `0.7` (Creative): Good for brainstorming or writing.
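To see why a low temperature gives stricter answers, the sketch below applies temperature scaling to a few made-up token scores. It shows the general technique (softmax with temperature) rather than this app's or any specific provider's implementation; the `logits` values are invented, and treating `0.0` as "always pick the top token" is a common convention assumed here.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw token scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Temperature 0 is treated as greedy decoding: all mass on the top score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.0, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature rises, probability shifts from the top-scoring token toward the alternatives, so the output becomes more varied and less predictable.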

	### Step 4: Start a Chat
	* Action: Go to the "Chats" tab, select the assistant you created in Step 3, and start a new chat.
	* Why? To begin the Q&A loop.
	* The RAG Loop (What actually happens):
	1. Query: You ask, "What is the refund policy?"
	2. Retrieval: The system searches your Knowledge Base for the 5 chunks most similar to your question (using the strategies in `RETRIEVAL_STRATEGIES.md`).
	3. Context Injection: Behind the scenes, it inserts those 5 chunks into the prompt, right before your question.
	4. Generation: The LLM reads the chunks and generates an answer grounded in them. How strictly it sticks to that information depends on your system prompt.
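The retrieval and context-injection steps of the loop can be sketched as follows. This is an illustrative toy, not the app's retrieval code: the bag-of-words `embed` function stands in for a real embedding model, the sample chunks are invented, and the prompt layout in `build_prompt` is an assumption.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count. A real system uses a neural model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 5) -> list[str]:
    """Return the top_k chunks ranked by similarity to the query."""
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(system_prompt: str, context: list[str], question: str) -> str:
    """Place the retrieved chunks ahead of the user's question."""
    context_block = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(context))
    return f"{system_prompt}\n\nContext:\n{context_block}\n\nQuestion: {question}"


chunks = [
    "Refund policy: refunds are issued within 30 days of purchase.",
    "Shipping takes 5 to 7 business days.",
    "Our office is closed on public holidays.",
]
question = "What is the refund policy?"
top_chunks = retrieve(question, chunks, top_k=2)
prompt = build_prompt("Answer strictly based on the provided documents.", top_chunks, question)
print(prompt)
```

The string returned by `build_prompt` is what would be sent to the LLM in the generation step; the user only ever sees their own question and the final answer.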

	---

	## 3. Why this Architecture?

	Why separate "Assistants" from "Knowledge Bases"? Why not just upload a file and chat?

	1. The Power of Reusability (One-to-Many)
	Imagine you have a huge Knowledge Base called "Company Financial Reports" (100+ PDFs). Indexing this takes time and storage. You can create multiple assistants that use this same data for different purposes:
	* Assistant A ("The Auditor"): Strict, low temperature, looks for errors.
	* Assistant B ("The Summarizer"): Creative, high temperature, writes blog posts.
	* Assistant C ("The Historian"): Focuses on year-over-year trends.
	All three use the same expensive vector index.

	2. Data Safety
	If you delete an Assistant (e.g., "The Auditor"), you do not delete the data. Your Knowledge Base remains intact. This prevents accidental data loss and allows you to experiment with different Assistant personalities without re-uploading files.
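Both points follow from assistants holding a reference to a Knowledge Base rather than a copy of it. The sketch below models that with a hypothetical in-memory `Workspace` class; the class, its method names, and the sample file names are illustrative assumptions, not the backend's real API.

```python
class Workspace:
    """Hypothetical in-memory registry of knowledge bases and assistants."""

    def __init__(self) -> None:
        self.knowledge_bases: dict[str, list[str]] = {}
        self.assistants: dict[str, dict] = {}

    def create_knowledge_base(self, name: str, documents: list[str]) -> None:
        self.knowledge_bases[name] = list(documents)

    def create_assistant(self, name: str, kb_name: str, temperature: float) -> None:
        if kb_name not in self.knowledge_bases:
            raise KeyError(f"unknown knowledge base: {kb_name}")
        # The assistant stores only the KB's name, never a copy of its data.
        self.assistants[name] = {"kb": kb_name, "temperature": temperature}

    def delete_assistant(self, name: str) -> None:
        # Removing an assistant leaves the knowledge base untouched.
        del self.assistants[name]


ws = Workspace()
ws.create_knowledge_base("Company Financial Reports", ["q1.pdf", "q2.pdf"])
ws.create_assistant("The Auditor", "Company Financial Reports", temperature=0.0)
ws.create_assistant("The Summarizer", "Company Financial Reports", temperature=0.7)
ws.delete_assistant("The Auditor")
print(ws.knowledge_bases["Company Financial Reports"])
```

After the delete, the remaining assistant still resolves to the same indexed documents, so nothing needs to be uploaded or embedded again.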
