# 🤖 RAG Chatbot — Application Report

**Course:** Makers Lab (Term 3) | **Institute:** SPJIMR  
**Date:** February 10, 2026

---

## 1. What Is This Application?

This is an **AI-powered chatbot** that can answer questions by reading through your own documents. Instead of searching the internet, it looks through a personal "knowledge base" — a folder of text files you provide — and gives you accurate, sourced answers.

Think of it like having a **personal assistant who has read all your documents** and can instantly recall information from them when you ask a question.

---

## 2. The Core Idea: RAG (Retrieval-Augmented Generation)

**RAG** stands for **Retrieval-Augmented Generation**. In simple terms, it combines two steps:

| Step | What Happens | Analogy |
|------|-------------|---------|
| **1. Retrieval** | The system searches your documents and finds the most relevant paragraphs related to your question | Like flipping through a textbook to find the right page |
| **2. Generation** | An AI model reads those paragraphs and writes a clear, human-like answer | Like a student summarizing what they found in their own words |

> [!IMPORTANT]
> The AI **only answers from your documents** — it does not make things up or pull information from the internet. If the answer isn't in your files, it will tell you.

---

## 3. How It Works (Step by Step)

```mermaid
flowchart LR
    A["📄 Your Documents"] --> B["✂️ Split into Chunks"]
    B --> C["🔢 Convert to Numbers\n(Embeddings)"]
    C --> D["🗄️ Store in FAISS\n(Vector Database)"]
    E["❓ Your Question"] --> F["🔢 Convert to Numbers"]
    F --> G["🔍 Find Similar Chunks"]
    D --> G
    G --> H["🤖 AI Generates Answer"]
    H --> I["💬 Response Shown"]
```

### Breaking it down:

1. **You add documents** — Place `.txt` files in the `knowledge_base` folder (e.g., company policies, notes, research papers)

2. **Documents are split** — Large files are broken into smaller, manageable pieces called "chunks" (like cutting a book into individual pages)

3. **Chunks become numbers** — Each chunk is converted into a list of numbers (called an "embedding") that captures its meaning. This is done by an **Embedding Model** running on HuggingFace's servers

4. **Numbers are stored** — These numerical representations are saved in a **FAISS database** (a fast search engine for numbers)

5. **You ask a question** — Your question is also converted into numbers the same way

6. **Similar chunks are found** — The system compares your question's numbers with all the chunk numbers to find the closest matches (like finding the most relevant pages)

7. **AI writes the answer** — The matching chunks are sent to a **Language Model (LLM)** which reads them and generates a clear, natural-language answer
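
Steps 5–6 can be sketched in plain Python. The three-number "embeddings" below are made up for illustration; the real app gets vectors with hundreds of dimensions from a HuggingFace model and searches them with FAISS rather than a Python loop:

```python
import math

def cosine_similarity(a, b):
    """Angle-based similarity between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(question_vec, chunk_vecs, chunks, k=2):
    """Return the k chunks whose embeddings are closest to the question's."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(question_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]

# Toy 3-dimensional "embeddings" standing in for the real model's output
chunks = ["Refund policy: 30 days", "Office hours: 9-5", "Shipping takes 5 days"]
chunk_vecs = [[0.9, 0.1, 0.0], [0.0, 0.9, 0.1], [0.1, 0.0, 0.9]]
question_vec = [0.8, 0.2, 0.1]  # e.g. "How long do refunds take?"

print(retrieve(question_vec, chunk_vecs, chunks, k=1))
```

The question vector points in nearly the same direction as the refund chunk's vector, so that chunk is retrieved and handed to the LLM in step 7.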

---

## 4. Key Features

### 📚 Custom Knowledge Base
- Add any `.txt` files to the `knowledge_base` folder
- Reload anytime using the sidebar button
- Currently loaded with 6 documents (profile, experience, skills, projects, achievements, goals)

### 🤖 Multiple AI Models
The app lets you choose from different AI models:

| Model | Best For |
|-------|----------|
| Mistral 7B Instruct | General-purpose, reliable |
| Zephyr 7B | Conversational, friendly |
| Phi-3 Mini | Fast, efficient |
| Llama 3.2 3B | Meta's compact model |
| Gemma 2 2B | Google's lightweight model |

### πŸ” Configurable Retrieval
- **Chunk Size** (500–2000): Controls how big each document piece is
- **Number of Results** (1–5): How many relevant pieces to retrieve
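
To see what chunk size and overlap mean in practice, here is a simplified character-window splitter. The app's real splitter (`RecursiveCharacterTextSplitter`) is smarter, preferring to break at paragraph and sentence boundaries, so treat this only as an illustration:

```python
def split_into_chunks(text, chunk_size=20, overlap=5):
    """Naive splitter: consecutive chunks share `overlap` characters of context."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "RAG grounds answers in your own documents."
for chunk in split_into_chunks(doc):
    print(repr(chunk))
```

A larger chunk size gives each retrieved piece more context but makes matches less precise; the overlap keeps a sentence that straddles a boundary from being lost.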

### 📄 Source Citations
Every answer includes an expandable section showing exactly which document fragments were used — so you can verify the answer.

### ⚡ 100% Free
All processing happens via HuggingFace's free Inference API — no paid subscriptions or expensive GPU hardware needed.

### 💬 Chat History
The app remembers your conversation, so you can ask follow-up questions naturally.
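
Conceptually, the history is just a list of role-tagged messages that is trimmed so prompts stay small. A minimal sketch; how the app actually stores it (e.g. in Streamlit session state) may differ:

```python
def add_turn(history, role, content, max_turns=20):
    """Append one message and keep only the most recent `max_turns` entries."""
    history = history + [{"role": role, "content": content}]
    return history[-max_turns:]

history = []
history = add_turn(history, "user", "What does RAG stand for?")
history = add_turn(history, "assistant", "Retrieval-Augmented Generation.")
print(len(history))  # 2
```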

---

## 5. Technology Stack

| Component | Technology | Role |
|-----------|-----------|------|
| User Interface | **Streamlit** | Creates the web-based chat interface |
| Document Loading | **LangChain** | Reads and processes text files |
| Text Splitting | **RecursiveCharacterTextSplitter** | Breaks documents into chunks intelligently |
| Embeddings | **HuggingFace API** (e.g., all-MiniLM-L6-v2) | Converts text into numerical representations |
| Vector Database | **FAISS** (Facebook AI Similarity Search) | Stores and searches embeddings efficiently |
| Answer Generation | **HuggingFace Inference API** | Runs the LLM to generate answers |
| Environment Mgmt | **python-dotenv** | Manages configuration securely |

---

## 6. How to Use the Application

### First-Time Setup
1. **Get a free HuggingFace account** at [huggingface.co](https://huggingface.co/join)
2. **Create a token** at [Settings → Tokens](https://huggingface.co/settings/tokens)
   - Choose **"Fine-grained"** type
   - Enable **"Make calls to Inference Providers"**
3. **Install dependencies**: `pip install -r requirements.txt`
4. **Add documents** to the `knowledge_base/` folder
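
The `.env` file keeps your token out of the source code. The variable name below is illustrative; check which name `app.py` actually reads:

```
# .env - keep this file out of version control
HUGGINGFACEHUB_API_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx
```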

### Running the App
```
streamlit run app.py
```
Then open `http://localhost:8501` in your browser.

### Asking Questions
1. Paste your HuggingFace token in the sidebar
2. Wait for the knowledge base to load (green ✅ confirmation)
3. Type your question in the chat box
4. View the AI-generated answer and optionally expand source documents

---

## 7. Project File Structure

```
ApplicationTest1/
├── app.py                  ← Main application (320 lines)
├── requirements.txt        ← Python package dependencies
├── .env                    ← Stores your HuggingFace token
├── README.md               ← Quick-start guide
├── knowledge_base/         ← Your documents go here
│   ├── profile.txt
│   ├── experience.txt
│   ├── skills.txt
│   ├── projects.txt
│   ├── achievements.txt
│   └── goals.txt
└── venv_rag/               ← Python virtual environment
```

---

## 8. Error Handling

The application includes user-friendly error handling:

| Error | What It Means | Solution |
|-------|--------------|----------|
| **403 Forbidden** | Token doesn't have correct permissions | Recreate token with "Inference Providers" enabled |
| **Model loading** | AI model is starting up on the server | Wait 20–30 seconds and retry |
| **No documents found** | Knowledge base folder is empty | Add `.txt` files and reload |
| **Embedding error** | Issue converting text to numbers | Check token and selected model |
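
A minimal sketch of how such friendly error mapping can work; the function name, structure, and message wording here are illustrative, not taken from `app.py`:

```python
# Map raw HTTP status codes from the Inference API to actionable advice.
FRIENDLY_ERRORS = {
    403: "Token lacks permissions: recreate it with 'Inference Providers' enabled.",
    503: "The model is still loading on the server: wait 20-30 seconds and retry.",
}

def friendly_message(status_code: int) -> str:
    """Translate an HTTP status code into a user-friendly hint."""
    return FRIENDLY_ERRORS.get(
        status_code,
        f"Unexpected error (HTTP {status_code}): check your token and selected model.",
    )

print(friendly_message(403))
```

Showing advice like this instead of a raw traceback is what keeps the app usable for non-programmers.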

---

## 9. Key Takeaways

> [!TIP]
> **Why RAG matters:** Traditional AI models can "hallucinate" — make up information. RAG solves this by grounding AI answers in your actual documents, making it far more reliable for business and academic use.

- **RAG = Search + AI** — Combines document retrieval with AI generation
- **Your data stays private** — Documents are processed in your session only
- **Completely free** — No paid APIs, no GPU required
- **Customizable** — Swap models, tune chunk sizes, change the knowledge base anytime
- **Transparent** — Always shows which sources were used for each answer

---

*Report prepared for Makers Lab, SPJIMR — Term 3*