RAG Chatbot – Application Report
Course: Makers Lab (Term 3) | Institute: SPJIMR
Date: February 10, 2026
1. What Is This Application?
This is an AI-powered chatbot that can answer questions by reading through your own documents. Instead of searching the internet, it looks through a personal "knowledge base" – a folder of text files you provide – and gives you accurate, sourced answers.
Think of it like having a personal assistant who has read all your documents and can instantly recall information from them when you ask a question.
2. The Core Idea: RAG (Retrieval-Augmented Generation)
RAG stands for Retrieval-Augmented Generation. In simple terms, it combines two steps:
| Step | What Happens | Analogy |
|---|---|---|
| 1. Retrieval | The system searches your documents and finds the most relevant paragraphs related to your question | Like flipping through a textbook to find the right page |
| 2. Generation | An AI model reads those paragraphs and writes a clear, human-like answer | Like a student summarizing what they found in their own words |
The AI only answers from your documents – it does not make things up or pull information from the internet. If the answer isn't in your files, it will tell you.
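Conceptually, this grounding works by placing the retrieved passages into the prompt and instructing the model to stay within them. Below is a hypothetical sketch of such a prompt template; the app's actual wording in `app.py` may differ:

```python
def build_grounded_prompt(question, chunks):
    """Assemble a prompt that restricts the LLM to the retrieved chunks
    and asks it to admit when the answer is not present."""
    context = "\n\n".join(f"[Source {i + 1}]\n{c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using ONLY the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_grounded_prompt(
    "What is the refund policy?",
    ["Refunds are issued within 14 days of purchase."],
)
```

Because the retrieved text travels inside the prompt, the model has no reason to reach beyond it, and a missing answer surfaces as an explicit "I don't know".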
3. How It Works (Step by Step)
```mermaid
flowchart LR
    A["Your Documents"] --> B["Split into Chunks"]
    B --> C["Convert to Numbers\n(Embeddings)"]
    C --> D["Store in FAISS\n(Vector Database)"]
    E["Your Question"] --> F["Convert to Numbers"]
    F --> G["Find Similar Chunks"]
    D --> G
    G --> H["AI Generates Answer"]
    H --> I["Response Shown"]
```
Breaking it down:
1. You add documents – Place `.txt` files in the `knowledge_base` folder (e.g., company policies, notes, research papers)
2. Documents are split – Large files are broken into smaller, manageable pieces called "chunks" (like cutting a book into individual pages)
3. Chunks become numbers – Each chunk is converted into a list of numbers (called an "embedding") that captures its meaning; this is done by an embedding model running on HuggingFace's servers
4. Numbers are stored – These numerical representations are saved in a FAISS database (a fast search engine for vectors)
5. You ask a question – Your question is also converted into numbers the same way
6. Similar chunks are found – The system compares your question's numbers with all the chunk numbers to find the closest matches (like finding the most relevant pages)
7. AI writes the answer – The matching chunks are sent to a Language Model (LLM), which reads them and generates a clear, natural-language answer
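The whole pipeline can be sketched end to end with a toy embedding. The real app uses a HuggingFace embedding model and a FAISS index; this sketch substitutes bag-of-words count vectors and a brute-force cosine-similarity search so the idea is visible in plain Python:

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector. (The real app calls a
    HuggingFace embedding model such as all-MiniLM-L6-v2 instead.)"""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Steps 1-2: documents already split into chunks
chunks = [
    "The office opens at 9 am on weekdays.",
    "Employees get 20 days of paid leave per year.",
    "The cafeteria serves lunch between noon and 2 pm.",
]

# Steps 3-4: "index" every chunk (FAISS would store these vectors for fast search)
index = [(chunk, embed(chunk)) for chunk in chunks]

# Steps 5-6: embed the question and rank chunks by similarity
question = "How many days of paid leave do employees get?"
q_vec = embed(question)
best_chunk = max(index, key=lambda pair: cosine(q_vec, pair[1]))[0]

# Step 7: the best chunk(s) would now be sent to the LLM as context
print(best_chunk)
```

Swapping the toy `embed` for a real embedding model and the `max` loop for a FAISS index changes the quality and speed, but not the shape of the pipeline.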
4. Key Features
Custom Knowledge Base
- Add any `.txt` files to the `knowledge_base` folder
- Reload anytime using the sidebar button
- Currently loaded with 6 documents (profile, experience, skills, projects, achievements, goals)
Multiple AI Models
The app lets you choose from different AI models:
| Model | Best For |
|---|---|
| Mistral 7B Instruct | General-purpose, reliable |
| Zephyr 7B | Conversational, friendly |
| Phi-3 Mini | Fast, efficient |
| Llama 3.2 3B | Meta's latest compact model |
| Gemma 2 2B | Google's lightweight model |
Configurable Retrieval
- Chunk Size (500–2000): Controls how big each document piece is
- Number of Results (1–5): How many relevant pieces to retrieve
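The effect of the Chunk Size setting can be illustrated with a simplified splitter. This is not LangChain's RecursiveCharacterTextSplitter (which also prefers paragraph and sentence boundaries); it is a minimal fixed-size sketch with overlap:

```python
def split_into_chunks(text, chunk_size=500, overlap=50):
    """Simplified character-based splitter with overlapping windows.
    Overlap keeps a sentence that straddles a boundary retrievable
    from at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

chunks = split_into_chunks("a" * 1200, chunk_size=500, overlap=50)
# 1200 chars -> windows starting at 0, 450, 900: lengths 500, 500, 300
```

Smaller chunks give more precise matches but less surrounding context per match; larger chunks give the LLM more context but dilute the similarity search.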
Source Citations
Every answer includes an expandable section showing exactly which document fragments were used, so you can verify the answer.
100% Free
All processing happens via HuggingFace's free Inference API – no paid subscriptions or expensive GPU hardware needed.
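As an illustration, a call to the hosted Inference API is just an authenticated HTTP request. The sketch below only assembles the request (nothing is sent); the model id, token, and generation parameters are placeholders:

```python
def build_inference_request(model_id, prompt, token):
    """Build (url, headers, payload) for a HuggingFace Inference API call.
    The URL pattern is the hosted Inference API endpoint; max_new_tokens
    is an illustrative parameter choice."""
    url = f"https://api-inference.huggingface.co/models/{model_id}"
    headers = {"Authorization": f"Bearer {token}"}
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": 256}}
    return url, headers, payload

url, headers, payload = build_inference_request(
    "mistralai/Mistral-7B-Instruct-v0.2",  # example model id
    "What are my key skills?",
    "hf_xxx",  # placeholder token
)
```

Because the heavy lifting happens on HuggingFace's servers, the local machine only needs to send this request and render the response.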
Chat History
The app remembers your conversation, so you can ask follow-up questions naturally.
5. Technology Stack
| Component | Technology | Role |
|---|---|---|
| User Interface | Streamlit | Creates the web-based chat interface |
| Document Loading | LangChain | Reads and processes text files |
| Text Splitting | RecursiveCharacterTextSplitter | Breaks documents into chunks intelligently |
| Embeddings | HuggingFace API (e.g., all-MiniLM-L6-v2) | Converts text into numerical representations |
| Vector Database | FAISS (Facebook AI Similarity Search) | Stores and searches embeddings efficiently |
| Answer Generation | HuggingFace Inference API | Runs the LLM to generate answers |
| Environment Mgmt | python-dotenv | Manages configuration securely |
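The python-dotenv entry in the stack is responsible for loading the token from `.env` into environment variables. A stdlib-only stand-in shows the idea (the key name `DEMO_HF_TOKEN` is hypothetical, and real python-dotenv also handles comments, quoting, and `export` syntax):

```python
import os
import tempfile
from pathlib import Path

def load_env_file(path):
    """Minimal stand-in for python-dotenv's load_dotenv(): read KEY=VALUE
    lines from a file and put them into os.environ."""
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# DEMO_HF_TOKEN is a made-up key used only for this illustration.
env_path = Path(tempfile.mkdtemp()) / "example.env"
env_path.write_text("DEMO_HF_TOKEN=hf_example\n")
load_env_file(env_path)
token = os.environ.get("DEMO_HF_TOKEN")
```

Keeping the token in `.env` (and out of source control) is what lets the app read credentials securely without hard-coding them in `app.py`.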
6. How to Use the Application
First-Time Setup
- Get a free HuggingFace account at huggingface.co
- Create a token at Settings → Tokens
- Choose "Fine-grained" type
- Enable "Make calls to Inference Providers"
- Install dependencies: `pip install -r requirements.txt`
- Add documents to the `knowledge_base/` folder
Running the App
```shell
streamlit run app.py
```
Then open http://localhost:8501 in your browser.
Asking Questions
- Paste your HuggingFace token in the sidebar
- Wait for the knowledge base to load (green checkmark confirmation)
- Type your question in the chat box
- View the AI-generated answer and optionally expand source documents
7. Project File Structure
```
ApplicationTest1/
├── app.py             – Main application (320 lines)
├── requirements.txt   – Python package dependencies
├── .env               – Stores your HuggingFace token
├── README.md          – Quick-start guide
├── knowledge_base/    – Your documents go here
│   ├── profile.txt
│   ├── experience.txt
│   ├── skills.txt
│   ├── projects.txt
│   ├── achievements.txt
│   └── goals.txt
└── venv_rag/          – Python virtual environment
```
8. Error Handling
The application includes user-friendly error handling:
| Error | What It Means | Solution |
|---|---|---|
| 403 Forbidden | Token doesn't have correct permissions | Recreate token with "Inference Providers" enabled |
| Model loading | AI model is starting up on the server | Wait 20–30 seconds and retry |
| No documents found | Knowledge base folder is empty | Add .txt files and reload |
| Embedding error | Issue converting text to numbers | Check token and selected model |
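The "model loading" case in particular is usually handled by waiting and retrying. A generic retry sketch follows; this is not the app's actual handler, and the error-message matching is an assumption:

```python
import time

def call_with_retry(call, retries=3, wait_seconds=25.0):
    """Retry a flaky API call. While a cold model spins up, the server
    answers with a 'model is loading' style error, so waiting 20-30
    seconds and retrying usually succeeds."""
    for attempt in range(retries):
        try:
            return call()
        except RuntimeError as err:
            if "loading" in str(err).lower() and attempt < retries - 1:
                time.sleep(wait_seconds)
            else:
                raise

# Demo with a fake endpoint that fails once, then succeeds:
state = {"calls": 0}
def fake_endpoint():
    state["calls"] += 1
    if state["calls"] == 1:
        raise RuntimeError("Model is currently loading")
    return "answer"

result = call_with_retry(fake_endpoint, retries=2, wait_seconds=0.0)
```

Errors that are not transient (such as a 403 from a mis-scoped token) are re-raised immediately rather than retried, since waiting cannot fix them.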
9. Key Takeaways
Why RAG matters: Traditional AI models can "hallucinate" – make up information. RAG solves this by grounding AI answers in your actual documents, making it far more reliable for business and academic use.
- RAG = Search + AI – Combines document retrieval with AI generation
- Your data stays private – Documents are processed in your session only
- Completely free – No paid APIs, no GPU required
- Customizable – Swap models, tune chunk sizes, change the knowledge base anytime
- Transparent – Always shows which sources were used for each answer
Report prepared for Makers Lab, SPJIMR – Term 3