Spaces:
Sleeping
Sleeping
| # π€ RAG Chatbot β Application Report | |
| **Course:** Makers Lab (Term 3) | **Institute:** SPJIMR | |
| **Date:** February 10, 2026 | |
| --- | |
| ## 1. What Is This Application? | |
| This is an **AI-powered chatbot** that can answer questions by reading through your own documents. Instead of searching the internet, it looks through a personal "knowledge base" β a folder of text files you provide β and gives you accurate, sourced answers. | |
| Think of it like having a **personal assistant who has read all your documents** and can instantly recall information from them when you ask a question. | |
| --- | |
| ## 2. The Core Idea: RAG (Retrieval-Augmented Generation) | |
| **RAG** stands for **Retrieval-Augmented Generation**. In simple terms, it combines two steps: | |
| | Step | What Happens | Analogy | | |
| |------|-------------|---------| | |
| | **1. Retrieval** | The system searches your documents and finds the most relevant paragraphs related to your question | Like flipping through a textbook to find the right page | | |
| | **2. Generation** | An AI model reads those paragraphs and writes a clear, human-like answer | Like a student summarizing what they found in their own words | | |
| > [!IMPORTANT] | |
| > The AI **only answers from your documents** β it does not make things up or pull information from the internet. If the answer isn't in your files, it will tell you. | |
| --- | |
| ## 3. How It Works (Step by Step) | |
| ```mermaid | |
| flowchart LR | |
| A["π Your Documents"] --> B["βοΈ Split into Chunks"] | |
| B --> C["π’ Convert to Numbers\n(Embeddings)"] | |
| C --> D["ποΈ Store in FAISS\n(Vector Database)"] | |
| E["β Your Question"] --> F["π’ Convert to Numbers"] | |
| F --> G["π Find Similar Chunks"] | |
| D --> G | |
| G --> H["π€ AI Generates Answer"] | |
| H --> I["π¬ Response Shown"] | |
| ``` | |
| ### Breaking it down: | |
| 1. **You add documents** β Place `.txt` files in the `knowledge_base` folder (e.g., company policies, notes, research papers) | |
| 2. **Documents are split** β Large files are broken into smaller, manageable pieces called "chunks" (like cutting a book into individual pages) | |
| 3. **Chunks become numbers** β Each chunk is converted into a list of numbers (called an "embedding") that captures its meaning. This is done by an **Embedding Model** running on HuggingFace's servers | |
| 4. **Numbers are stored** β These numerical representations are saved in a **FAISS database** (a fast search engine for numbers) | |
| 5. **You ask a question** β Your question is also converted into numbers the same way | |
| 6. **Similar chunks are found** β The system compares your question's numbers with all the chunk numbers to find the closest matches (like finding the most relevant pages) | |
| 7. **AI writes the answer** β The matching chunks are sent to a **Language Model (LLM)** which reads them and generates a clear, natural-language answer | |
| --- | |
| ## 4. Key Features | |
| ### π Custom Knowledge Base | |
| - Add any `.txt` files to the `knowledge_base` folder | |
| - Reload anytime using the sidebar button | |
| - Currently loaded with 6 documents (profile, experience, skills, projects, achievements, goals) | |
| ### π€ Multiple AI Models | |
| The app lets you choose from different AI models: | |
| | Model | Best For | | |
| |-------|----------| | |
| | Mistral 7B Instruct | General-purpose, reliable | | |
| | Zephyr 7B | Conversational, friendly | | |
| | Phi-3 Mini | Fast, efficient | | |
| | Llama 3.2 3B | Meta's latest compact model | | |
| | Gemma 2 2B | Google's lightweight model | | |
| ### π Configurable Retrieval | |
| - **Chunk Size** (500β2000): Controls how big each document piece is | |
| - **Number of Results** (1β5): How many relevant pieces to retrieve | |
| ### π Source Citations | |
| Every answer includes an expandable section showing exactly which document fragments were used β so you can verify the answer. | |
| ### β‘ 100% Free | |
| All processing happens via HuggingFace's free Inference API β no paid subscriptions or expensive GPU hardware needed. | |
| ### π¬ Chat History | |
| The app remembers your conversation, so you can ask follow-up questions naturally. | |
| --- | |
| ## 5. Technology Stack | |
| | Component | Technology | Role | | |
| |-----------|-----------|------| | |
| | User Interface | **Streamlit** | Creates the web-based chat interface | | |
| | Document Loading | **LangChain** | Reads and processes text files | | |
| | Text Splitting | **RecursiveCharacterTextSplitter** | Breaks documents into chunks intelligently | | |
| | Embeddings | **HuggingFace API** (e.g., all-MiniLM-L6-v2) | Converts text into numerical representations | | |
| | Vector Database | **FAISS** (Facebook AI Similarity Search) | Stores and searches embeddings efficiently | | |
| | Answer Generation | **HuggingFace Inference API** | Runs the LLM to generate answers | | |
| | Environment Mgmt | **python-dotenv** | Manages configuration securely | | |
| --- | |
| ## 6. How to Use the Application | |
| ### First-Time Setup | |
| 1. **Get a free HuggingFace account** at [huggingface.co](https://huggingface.co/join) | |
| 2. **Create a token** at [Settings β Tokens](https://huggingface.co/settings/tokens) | |
| - Choose **"Fine-grained"** type | |
| - Enable **"Make calls to Inference Providers"** | |
| 3. **Install dependencies**: `pip install -r requirements.txt` | |
| 4. **Add documents** to the `knowledge_base/` folder | |
| ### Running the App | |
| ``` | |
| streamlit run app.py | |
| ``` | |
| Then open `http://localhost:8501` in your browser. | |
| ### Asking Questions | |
| 1. Paste your HuggingFace token in the sidebar | |
| 2. Wait for the knowledge base to load (green β confirmation) | |
| 3. Type your question in the chat box | |
| 4. View the AI-generated answer and optionally expand source documents | |
| --- | |
| ## 7. Project File Structure | |
| ``` | |
| ApplicationTest1/ | |
| βββ app.py β Main application (320 lines) | |
| βββ requirements.txt β Python package dependencies | |
| βββ .env β Stores your HuggingFace token | |
| βββ README.md β Quick-start guide | |
| βββ knowledge_base/ β Your documents go here | |
| β βββ profile.txt | |
| β βββ experience.txt | |
| β βββ skills.txt | |
| β βββ projects.txt | |
| β βββ achievements.txt | |
| β βββ goals.txt | |
| βββ venv_rag/ β Python virtual environment | |
| ``` | |
| --- | |
| ## 8. Error Handling | |
| The application includes user-friendly error handling: | |
| | Error | What It Means | Solution | | |
| |-------|--------------|----------| | |
| | **403 Forbidden** | Token doesn't have correct permissions | Recreate token with "Inference Providers" enabled | | |
| | **Model loading** | AI model is starting up on the server | Wait 20β30 seconds and retry | | |
| | **No documents found** | Knowledge base folder is empty | Add `.txt` files and reload | | |
| | **Embedding error** | Issue converting text to numbers | Check token and selected model | | |
| --- | |
| ## 9. Key Takeaways | |
| > [!TIP] | |
| > **Why RAG matters:** Traditional AI models can "hallucinate" β make up information. RAG solves this by grounding AI answers in your actual documents, making it far more reliable for business and academic use. | |
| - **RAG = Search + AI** β Combines document retrieval with AI generation | |
| - **Your data stays private** β Documents are processed in your session only | |
| - **Completely free** β No paid APIs, no GPU required | |
| - **Customizable** β Swap models, tune chunk sizes, change the knowledge base anytime | |
| - **Transparent** β Always shows which sources were used for each answer | |
| --- | |
| *Report prepared for Makers Lab, SPJIMR β Term 3* | |