
πŸ€– RAG Chatbot β€” Application Report

Course: Makers Lab (Term 3) | Institute: SPJIMR
Date: February 10, 2026


1. What Is This Application?

This is an AI-powered chatbot that can answer questions by reading through your own documents. Instead of searching the internet, it looks through a personal "knowledge base" β€” a folder of text files you provide β€” and gives you accurate, sourced answers.

Think of it like having a personal assistant who has read all your documents and can instantly recall information from them when you ask a question.


2. The Core Idea: RAG (Retrieval-Augmented Generation)

RAG stands for Retrieval-Augmented Generation. In simple terms, it combines two steps:

| Step | What Happens | Analogy |
|------|--------------|---------|
| 1. Retrieval | The system searches your documents and finds the paragraphs most relevant to your question | Like flipping through a textbook to find the right page |
| 2. Generation | An AI model reads those paragraphs and writes a clear, human-like answer | Like a student summarizing what they found in their own words |

The AI only answers from your documents β€” it does not make things up or pull information from the internet. If the answer isn't in your files, it will tell you.
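That grounding is usually enforced in the prompt itself: the retrieved paragraphs are pasted into the prompt along with an instruction to answer only from them. A minimal sketch (the exact wording used by app.py may differ):

```python
# Minimal sketch of a grounded RAG prompt. The instruction wording is
# illustrative, not the app's actual prompt.
def build_prompt(question: str, chunks: list[str]) -> str:
    """Paste retrieved chunks into the prompt and restrict the model to them."""
    context = "\n\n".join(chunks)
    return (
        "Answer the question using ONLY the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt(
    "How many days of paid leave do employees get?",
    ["Employees get 20 days of paid leave per year."],
)
```

Because the instruction and the context travel together, the model has both the rule ("only the context") and the evidence in a single request.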


3. How It Works (Step by Step)

```mermaid
flowchart LR
    A["πŸ“„ Your Documents"] --> B["βœ‚οΈ Split into Chunks"]
    B --> C["πŸ”’ Convert to Numbers\n(Embeddings)"]
    C --> D["πŸ—„οΈ Store in FAISS\n(Vector Database)"]
    E["❓ Your Question"] --> F["πŸ”’ Convert to Numbers"]
    F --> G["πŸ” Find Similar Chunks"]
    D --> G
    G --> H["πŸ€– AI Generates Answer"]
    H --> I["πŸ’¬ Response Shown"]
```

Breaking it down:

  1. You add documents β€” Place .txt files in the knowledge_base folder (e.g., company policies, notes, research papers)

  2. Documents are split β€” Large files are broken into smaller, manageable pieces called "chunks" (like cutting a book into individual pages)

  3. Chunks become numbers β€” Each chunk is converted into a list of numbers (called an "embedding") that captures its meaning. This conversion is done by an embedding model running on HuggingFace's servers

  4. Numbers are stored β€” These numerical representations are saved in a FAISS database (a fast search engine for numbers)

  5. You ask a question β€” Your question is also converted into numbers the same way

  6. Similar chunks are found β€” The system compares your question's numbers with all the chunk numbers to find the closest matches (like finding the most relevant pages)

  7. AI writes the answer β€” The matching chunks are sent to a Language Model (LLM) which reads them and generates a clear, natural-language answer
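Steps 3–6 can be sketched with a toy example. The real app uses a neural embedding model (e.g. all-MiniLM-L6-v2) and FAISS; here a simple bag-of-words count stands in for the embedding so the ranking idea is visible:

```python
# Toy sketch of steps 3-6: turn text into vectors, then rank chunks by
# similarity to the question. A real RAG app uses learned embeddings and
# FAISS instead of word counts, but the comparison works the same way.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Bag-of-words stand-in for a neural embedding."""
    words = text.lower().replace(".", "").replace("?", "").split()
    return Counter(words)

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

chunks = [
    "The refund policy allows returns within 30 days.",
    "Our office is open Monday to Friday.",
    "Employees get 20 days of paid leave per year.",
]
question = "What is the refund policy for returns?"

# Step 6: compare the question's vector with every chunk's vector and
# keep the closest matches.
q_vec = embed(question)
ranked = sorted(chunks, key=lambda c: cosine(q_vec, embed(c)), reverse=True)
top_chunk = ranked[0]
```

Here the refund chunk ranks first because it shares the most meaningful words with the question; neural embeddings do the same ranking, but also match on meaning rather than exact words.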


4. Key Features

πŸ“š Custom Knowledge Base

  • Add any .txt files to the knowledge_base folder
  • Reload anytime using the sidebar button
  • Currently loaded with 6 documents (profile, experience, skills, projects, achievements, goals)

πŸ€– Multiple AI Models

The app lets you choose from different AI models:

| Model | Best For |
|-------|----------|
| Mistral 7B Instruct | General-purpose, reliable |
| Zephyr 7B | Conversational, friendly |
| Phi-3 Mini | Fast, efficient |
| Llama 3.2 3B | Meta's latest compact model |
| Gemma 2 2B | Google's lightweight model |

πŸ” Configurable Retrieval

  • Chunk Size (500–2000 characters): Controls how big each document piece is
  • Number of Results (1–5): How many relevant pieces to retrieve
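What the Chunk Size setting controls can be shown with a simplified splitter. The app itself uses LangChain's RecursiveCharacterTextSplitter, which prefers paragraph and sentence boundaries; this stand-in just slices by character count with a small overlap:

```python
# Simplified stand-in for the app's text splitter. Unlike LangChain's
# RecursiveCharacterTextSplitter, it ignores sentence boundaries and
# cuts purely by character count, but the size/overlap idea is the same.
def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Slice text into fixed-size windows; the overlap means a sentence
    straddling a boundary still appears whole in at least one chunk."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

pieces = split_into_chunks("a" * 2500, chunk_size=1000, overlap=100)
```

Smaller chunks give more precise matches but less surrounding context per match; larger chunks do the reverse, which is why the setting is exposed in the sidebar.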

πŸ“„ Source Citations

Every answer includes an expandable section showing exactly which document fragments were used β€” so you can verify the answer.

⚑ 100% Free

All processing happens via HuggingFace's free Inference API β€” no paid subscriptions or expensive GPU hardware needed.

πŸ’¬ Chat History

The app remembers your conversation, so you can ask follow-up questions naturally.


5. Technology Stack

| Component | Technology | Role |
|-----------|------------|------|
| User Interface | Streamlit | Creates the web-based chat interface |
| Document Loading | LangChain | Reads and processes text files |
| Text Splitting | RecursiveCharacterTextSplitter | Breaks documents into chunks intelligently |
| Embeddings | HuggingFace API (e.g., all-MiniLM-L6-v2) | Converts text into numerical representations |
| Vector Database | FAISS (Facebook AI Similarity Search) | Stores and searches embeddings efficiently |
| Answer Generation | HuggingFace Inference API | Runs the LLM to generate answers |
| Environment Mgmt | python-dotenv | Manages configuration securely |
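Based on this stack, a requirements.txt along the following lines would be expected; the project's actual file (and any version pins) may differ:

```
streamlit
langchain
faiss-cpu
huggingface_hub
python-dotenv
```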

6. How to Use the Application

First-Time Setup

  1. Get a free HuggingFace account at huggingface.co
  2. Create a token at Settings β†’ Tokens
    • Choose "Fine-grained" type
    • Enable "Make calls to Inference Providers"
  3. Install dependencies: pip install -r requirements.txt
  4. Add documents to the knowledge_base/ folder
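Since python-dotenv is in the stack, the token typically lives in a .env file at the project root. The variable name below is an assumption; match whatever name app.py actually reads:

```
# .env -- keep this file out of version control
HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx
```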

Running the App

```shell
streamlit run app.py
```

Then open http://localhost:8501 in your browser.

Asking Questions

  1. Paste your HuggingFace token in the sidebar
  2. Wait for the knowledge base to load (green βœ… confirmation)
  3. Type your question in the chat box
  4. View the AI-generated answer and optionally expand source documents

7. Project File Structure

```
ApplicationTest1/
β”œβ”€β”€ app.py                  ← Main application (320 lines)
β”œβ”€β”€ requirements.txt        ← Python package dependencies
β”œβ”€β”€ .env                    ← Stores your HuggingFace token
β”œβ”€β”€ README.md               ← Quick-start guide
β”œβ”€β”€ knowledge_base/         ← Your documents go here
β”‚   β”œβ”€β”€ profile.txt
β”‚   β”œβ”€β”€ experience.txt
β”‚   β”œβ”€β”€ skills.txt
β”‚   β”œβ”€β”€ projects.txt
β”‚   β”œβ”€β”€ achievements.txt
β”‚   └── goals.txt
└── venv_rag/               ← Python virtual environment
```

8. Error Handling

The application includes user-friendly error handling:

| Error | What It Means | Solution |
|-------|---------------|----------|
| 403 Forbidden | Token doesn't have the correct permissions | Recreate the token with "Inference Providers" enabled |
| Model loading | AI model is starting up on the server | Wait 20–30 seconds and retry |
| No documents found | Knowledge base folder is empty | Add .txt files and reload |
| Embedding error | Issue converting text to numbers | Check token and selected model |
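The "Model loading" case is typically handled with a wait-and-retry loop. A minimal sketch of that pattern, where `flaky` is a hypothetical stand-in for the real Inference API call:

```python
# Sketch of the wait-and-retry pattern used for "model loading" errors.
# `flaky` below is a hypothetical stand-in for the real API call.
import time

def with_retries(query, attempts: int = 3, delay: float = 0.0):
    """Call query(); on failure, wait `delay` seconds and try again,
    up to `attempts` times, before giving up."""
    last_error = None
    for _ in range(attempts):
        try:
            return query()
        except RuntimeError as err:  # stand-in for a "model loading" response
            last_error = err
            time.sleep(delay)
    raise last_error

calls = {"n": 0}
def flaky():
    """Fails twice (model still warming up), then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("model loading")
    return "answer"

result = with_retries(flaky, attempts=5, delay=0.0)
```

In practice the delay would be the 20–30 seconds mentioned above, so the model has time to finish warming up between attempts.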

9. Key Takeaways

Why RAG matters: Traditional AI models can "hallucinate" β€” make up information. RAG solves this by grounding AI answers in your actual documents, making it far more reliable for business and academic use.

  • RAG = Search + AI β€” Combines document retrieval with AI generation
  • Your data stays private β€” Documents are processed in your session only
  • Completely free β€” No paid APIs, no GPU required
  • Customizable β€” Swap models, tune chunk sizes, change the knowledge base anytime
  • Transparent β€” Always shows which sources were used for each answer

Report prepared for Makers Lab, SPJIMR β€” Term 3