# RAG Chatbot: Application Report
**Course:** Makers Lab (Term 3) | **Institute:** SPJIMR
**Date:** February 10, 2026
---
## 1. What Is This Application?
This is an **AI-powered chatbot** that can answer questions by reading through your own documents. Instead of searching the internet, it looks through a personal "knowledge base" (a folder of text files you provide) and gives you accurate, sourced answers.
Think of it like having a **personal assistant who has read all your documents** and can instantly recall information from them when you ask a question.
---
## 2. The Core Idea: RAG (Retrieval-Augmented Generation)
**RAG** stands for **Retrieval-Augmented Generation**. In simple terms, it combines two steps:
| Step | What Happens | Analogy |
|------|-------------|---------|
| **1. Retrieval** | The system searches your documents and finds the most relevant paragraphs related to your question | Like flipping through a textbook to find the right page |
| **2. Generation** | An AI model reads those paragraphs and writes a clear, human-like answer | Like a student summarizing what they found in their own words |
> [!IMPORTANT]
> The AI **only answers from your documents**; it does not make things up or pull information from the internet. If the answer isn't in your files, it will tell you.
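The two steps in the table can be sketched in a few lines of Python. This is a toy illustration, not the app's actual code: `retrieve` ranks chunks by simple word overlap (the real app uses embeddings), and `generate` is a stand-in for the LLM call.

```python
# Toy sketch of the two RAG steps; function names are illustrative,
# not taken from the app's source.

def retrieve(question, chunks, k=1):
    """Step 1 (Retrieval): rank document chunks by word overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(q_words & set(c.lower().split())),
                    reverse=True)
    return scored[:k]

def generate(question, context):
    """Step 2 (Generation): a real LLM writes the answer; here we just show grounding."""
    return "Based on the documents: " + " ".join(context)

chunks = ["The office opens at 9 AM.", "Lunch break is 1 PM to 2 PM."]
context = retrieve("When does the office open?", chunks)
print(generate("When does the office open?", context))
```

The key property is visible even in this sketch: the answer is built only from the retrieved chunks, never from the model's memory alone.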
---
## 3. How It Works (Step by Step)
```mermaid
flowchart LR
A["Your Documents"] --> B["Split into Chunks"]
B --> C["Convert to Numbers\n(Embeddings)"]
C --> D["Store in FAISS\n(Vector Database)"]
E["Your Question"] --> F["Convert to Numbers"]
F --> G["Find Similar Chunks"]
D --> G
G --> H["AI Generates Answer"]
H --> I["Response Shown"]
```
### Breaking it down:
1. **You add documents** – place `.txt` files in the `knowledge_base` folder (e.g., company policies, notes, research papers)
2. **Documents are split** – large files are broken into smaller, manageable pieces called "chunks" (like cutting a book into individual pages)
3. **Chunks become numbers** – each chunk is converted into a list of numbers (an "embedding") that captures its meaning, using an **embedding model** running on HuggingFace's servers
4. **Numbers are stored** – these numerical representations are saved in a **FAISS index** (a fast similarity-search engine for vectors)
5. **You ask a question** – your question is converted into numbers the same way
6. **Similar chunks are found** – the system compares your question's numbers with all the chunk numbers to find the closest matches (like finding the most relevant pages)
7. **AI writes the answer** – the matching chunks are sent to a **large language model (LLM)**, which reads them and generates a clear, natural-language answer
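The splitting, embedding, and similarity-search steps above can be sketched end to end with a toy "embedding". This is not the app's code: the real app uses a HuggingFace embedding model and FAISS, but a word-count vector and cosine similarity show the same idea.

```python
import math

# Toy sketch of steps 2-6: split, embed, and find the most similar chunks.
# A word-count vector stands in for a real embedding model.

def split_into_chunks(text, size=40):
    """Step 2: break a document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text, vocab):
    """Steps 3 and 5: turn text into a list of numbers (word counts here)."""
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]

def cosine(a, b):
    """Similarity between two vectors: 1.0 means they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norms if norms else 0.0

def find_similar(question, chunks, k=2):
    """Step 6: rank every chunk by similarity to the question."""
    vocab = sorted({w for t in chunks + [question] for w in t.lower().split()})
    q_vec = embed(question, vocab)
    return sorted(chunks, key=lambda c: cosine(q_vec, embed(c, vocab)), reverse=True)[:k]

chunks = ["python is a programming language", "the cat sat on the mat"]
print(find_similar("what language is python", chunks, k=1))
```

FAISS does the same nearest-vector search, just over millions of high-dimensional embeddings at high speed.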
---
## 4. Key Features
### Custom Knowledge Base
- Add any `.txt` files to the `knowledge_base` folder
- Reload anytime using the sidebar button
- Currently loaded with 6 documents (profile, experience, skills, projects, achievements, goals)
### Multiple AI Models
The app lets you choose from different AI models:
| Model | Best For |
|-------|----------|
| Mistral 7B Instruct | General-purpose, reliable |
| Zephyr 7B | Conversational, friendly |
| Phi-3 Mini | Fast, efficient |
| Llama 3.2 3B | Meta's latest compact model |
| Gemma 2 2B | Google's lightweight model |
### Configurable Retrieval
- **Chunk Size** (500–2000): controls how big each document piece is
- **Number of Results** (1–5): how many relevant pieces to retrieve
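The effect of the chunk-size setting can be seen with a simple splitter. The app's `RecursiveCharacterTextSplitter` is smarter (it prefers paragraph and sentence boundaries), but the size/overlap trade-off sketched here is the same: bigger chunks mean more context per piece, smaller chunks mean more precise matches.

```python
# Simplified splitter showing the chunk_size/overlap trade-off.
# Overlap keeps a sentence that straddles a boundary from being lost.

def split_with_overlap(text, chunk_size=1000, overlap=100):
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

doc = "x" * 2500
print([len(c) for c in split_with_overlap(doc, chunk_size=1000, overlap=100)])
# → [1000, 1000, 700]
```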
### Source Citations
Every answer includes an expandable section showing exactly which document fragments were used, so you can verify the answer.
### 100% Free
All processing happens via HuggingFace's free Inference API; no paid subscriptions or expensive GPU hardware needed.
### Chat History
The app remembers your conversation, so you can ask follow-up questions naturally.
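Conversation memory is conceptually just a list of messages, with the most recent turns handed back to the model so follow-ups make sense. Streamlit apps typically keep this list in `st.session_state`; a plain list is used in this sketch, and the function names are illustrative, not the app's.

```python
# Minimal sketch of chat memory: store turns, replay the recent ones.

history = []

def remember(role, content):
    history.append({"role": role, "content": content})

def recent_context(max_turns=3):
    """Format the last few exchanges for inclusion in the model's prompt."""
    return "\n".join(f"{m['role']}: {m['content']}" for m in history[-2 * max_turns:])

remember("user", "What does the company do?")
remember("assistant", "It builds analytics software.")
remember("user", "Who founded it?")   # this follow-up only makes sense with context
print(recent_context())
```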
---
## 5. Technology Stack
| Component | Technology | Role |
|-----------|-----------|------|
| User Interface | **Streamlit** | Creates the web-based chat interface |
| Document Loading | **LangChain** | Reads and processes text files |
| Text Splitting | **RecursiveCharacterTextSplitter** | Breaks documents into chunks intelligently |
| Embeddings | **HuggingFace API** (e.g., all-MiniLM-L6-v2) | Converts text into numerical representations |
| Vector Database | **FAISS** (Facebook AI Similarity Search) | Stores and searches embeddings efficiently |
| Answer Generation | **HuggingFace Inference API** | Runs the LLM to generate answers |
| Environment Mgmt | **python-dotenv** | Manages configuration securely |
---
## 6. How to Use the Application
### First-Time Setup
1. **Get a free HuggingFace account** at [huggingface.co](https://huggingface.co/join)
2. **Create a token** at [Settings β Tokens](https://huggingface.co/settings/tokens)
- Choose **"Fine-grained"** type
- Enable **"Make calls to Inference Providers"**
3. **Install dependencies**: `pip install -r requirements.txt`
4. **Add documents** to the `knowledge_base/` folder
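The token itself belongs in the `.env` file so it never appears in your code or a Git history. The variable name below is the common convention that LangChain reads by default; the app may use a different name, so check `app.py` if loading fails.

```
# .env — keep this file out of version control
HUGGINGFACEHUB_API_TOKEN=hf_your_token_here
```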
### Running the App
```
streamlit run app.py
```
Then open `http://localhost:8501` in your browser.
### Asking Questions
1. Paste your HuggingFace token in the sidebar
2. Wait for the knowledge base to load (a green checkmark confirms it)
3. Type your question in the chat box
4. View the AI-generated answer and optionally expand source documents
---
## 7. Project File Structure
```
ApplicationTest1/
├── app.py               ← Main application (320 lines)
├── requirements.txt     ← Python package dependencies
├── .env                 ← Stores your HuggingFace token
├── README.md            ← Quick-start guide
├── knowledge_base/      ← Your documents go here
│   ├── profile.txt
│   ├── experience.txt
│   ├── skills.txt
│   ├── projects.txt
│   ├── achievements.txt
│   └── goals.txt
└── venv_rag/            ← Python virtual environment
```
---
## 8. Error Handling
The application includes user-friendly error handling:
| Error | What It Means | Solution |
|-------|--------------|----------|
| **403 Forbidden** | Token doesn't have correct permissions | Recreate token with "Inference Providers" enabled |
| **Model loading** | AI model is starting up on the server | Wait 20–30 seconds and retry |
| **No documents found** | Knowledge base folder is empty | Add `.txt` files and reload |
| **Embedding error** | Issue converting text to numbers | Check token and selected model |
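The "model loading" case is the one worth handling in code: HuggingFace returns an error while a model warms up, so the usual pattern is to wait briefly and retry. The sketch below is generic; `call_model` is a stand-in for the real API call, not a function from the app.

```python
import time

# Retry pattern for transient "model is loading" errors.
# RuntimeError stands in for whatever exception the API client raises.

def call_with_retry(call_model, retries=3, wait_seconds=20):
    for attempt in range(retries):
        try:
            return call_model()
        except RuntimeError as err:
            if "loading" in str(err) and attempt < retries - 1:
                time.sleep(wait_seconds)   # give the model time to start up
            else:
                raise                      # out of retries, or a different error
```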
---
## 9. Key Takeaways
> [!TIP]
> **Why RAG matters:** Traditional AI models can "hallucinate", i.e., make up information. RAG solves this by grounding AI answers in your actual documents, making it far more reliable for business and academic use.
- **RAG = Search + AI**: combines document retrieval with AI generation
- **Your data stays private**: documents are processed in your session only
- **Completely free**: no paid APIs, no GPU required
- **Customizable**: swap models, tune chunk sizes, change the knowledge base anytime
- **Transparent**: always shows which sources were used for each answer
---
*Report prepared for Makers Lab, SPJIMR, Term 3*