	# 🤖 RAG Chatbot — Application Report

	Course: Makers Lab (Term 3) | Institute: SPJIMR
	Date: February 10, 2026

	---

	## 1. What Is This Application?

	This is an AI-powered chatbot that answers questions from your own documents. Instead of searching the internet, it searches a personal "knowledge base" (a folder of text files you provide) and returns answers that cite the passages they came from.

	Think of it like having a personal assistant who has read all your documents and can instantly recall information from them when you ask a question.

	---

	## 2. The Core Idea: RAG (Retrieval-Augmented Generation)

	RAG stands for Retrieval-Augmented Generation. In simple terms, it combines two steps:

	| Step | What Happens | Analogy |
	|------|-------------|---------|
	| 1. Retrieval | The system searches your documents and finds the most relevant paragraphs related to your question | Like flipping through a textbook to find the right page |
	| 2. Generation | An AI model reads those paragraphs and writes a clear, human-like answer | Like a student summarizing what they found in their own words |

	> [!IMPORTANT]
	> The AI is meant to answer only from your documents, not from the internet. If the answer isn't in your files, it should say so. Because a language model writes the final text, check the cited sources when accuracy matters.
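
One common way to keep the model on the documents is to say so in the prompt. The sketch below is an assumption about how such a prompt could be assembled, not the app's actual code: `build_prompt`, the numbering scheme, and the fallback sentence are hypothetical names chosen for illustration.

```python
def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a prompt that tells the model to answer only from the given chunks."""
    # Number each chunk so the answer can be traced back to a source fragment.
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using ONLY the context below. "
        "If the context does not contain the answer, reply: "
        '"I could not find this in the documents."\n\n'
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


if __name__ == "__main__":
    # Made-up sample text, only to show the shape of the prompt.
    print(build_prompt("Where did I work?", ["Worked at Acme from 2020 to 2023."]))
```

An instruction like this makes off-document answers less likely, but it cannot rule them out.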

	---

	## 3. How It Works (Step by Step)

	```mermaid
	flowchart LR
	A["📄 Your Documents"] --> B["✂️ Split into Chunks"]
	B --> C["🔢 Convert to Numbers\n(Embeddings)"]
	C --> D["🗄️ Store in FAISS\n(Vector Database)"]
	E["❓ Your Question"] --> F["🔢 Convert to Numbers"]
	F --> G["🔍 Find Similar Chunks"]
	D --> G
	G --> H["🤖 AI Generates Answer"]
	H --> I["💬 Response Shown"]
	```

	### Breaking it down:

	1. You add documents: place `.txt` files in the `knowledge_base` folder (e.g., company policies, notes, research papers).

	2. Documents are split: large files are broken into smaller pieces called "chunks" (like cutting a book into individual pages).

	3. Chunks become numbers: each chunk is converted into a list of numbers (an "embedding") that captures its meaning. An embedding model running on HuggingFace's servers does this.

	4. Numbers are stored: the embeddings are saved in a FAISS index, a library for fast similarity search over vectors.

	5. You ask a question: your question is converted into an embedding by the same model.

	6. Similar chunks are found: the system compares the question's embedding with every chunk embedding and keeps the closest matches (like finding the most relevant pages).

	7. The AI writes the answer: the matching chunks are sent to a large language model (LLM), which reads them and generates a natural-language answer.
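
The retrieval half of these steps can be sketched in plain Python. This is a toy under stated assumptions: word counts stand in for the real embedding model, and a sorted list stands in for FAISS. The names `embed`, `cosine`, and `retrieve` are hypothetical and do not come from `app.py`.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. The real app calls an embedding model instead."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (steps 5 and 6)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Made-up chunks, loosely modelled on a personal knowledge base.
    chunks = [
        "Python and SQL are my main programming skills.",
        "My goal is to work in product management.",
        "I led a team project on supply chain analytics.",
    ]
    print(retrieve("What programming skills do I have?", chunks, k=1))
```

A learned embedding model matches on meaning rather than shared words, so it can find a relevant chunk even when the question uses different vocabulary. The word-count version above cannot.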

	---

	## 4. Key Features

	### 📚 Custom Knowledge Base
	- Add any `.txt` files to the `knowledge_base` folder
	- Reload anytime using the sidebar button
	- Currently loaded with 6 documents (profile, experience, skills, projects, achievements, goals)

	### 🤖 Multiple AI Models
	The app lets you choose from different AI models:

	| Model | Best For |
	|-------|----------|
	| Mistral 7B Instruct | General-purpose, reliable |
	| Zephyr 7B | Conversational, friendly |
	| Phi-3 Mini | Fast, efficient |
	| Llama 3.2 3B | Meta's latest compact model |
	| Gemma 2 2B | Google's lightweight model |

	### 🔍 Configurable Retrieval
	- Chunk Size (500–2000 characters): controls how large each document piece is
	- Number of Results (1–5): how many of the most relevant pieces to retrieve
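
To see what the chunk size setting does, here is a minimal fixed-window splitter. It is a simplification: the app uses `RecursiveCharacterTextSplitter`, which prefers to break at paragraph and sentence boundaries, and the `overlap` parameter is an assumption that the report does not mention.

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Cut text into windows of at most chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += chunk_size - overlap
    return chunks


if __name__ == "__main__":
    document = "x" * 5000  # stand-in for a 5,000-character file
    for size in (500, 1000, 2000):
        print(size, len(split_text(document, chunk_size=size, overlap=0)))
```

With no overlap, the 5,000-character stand-in yields 10, 5, and 3 chunks for sizes 500, 1000, and 2000. Smaller chunks give more precise matches. Larger chunks give the model more surrounding context per result.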

	### 📄 Source Citations
	Every answer includes an expandable section showing exactly which document fragments were used — so you can verify the answer.
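
A citation list like this can be built from the retrieved fragments alone. The sketch below assumes each fragment carries the name of the file it came from. `Chunk` and `format_sources` are hypothetical names, and the output format is illustrative rather than what the app renders.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # file the fragment came from, e.g. "skills.txt"
    text: str    # the fragment itself


def format_sources(chunks: list[Chunk], preview_chars: int = 80) -> str:
    """Render one numbered line per retrieved fragment: file name plus a short preview."""
    lines = []
    for i, chunk in enumerate(chunks, start=1):
        preview = " ".join(chunk.text.split())  # collapse newlines and extra spaces
        if len(preview) > preview_chars:
            preview = preview[:preview_chars].rstrip() + "..."
        lines.append(f"{i}. {chunk.source}: {preview}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_sources([Chunk("skills.txt", "Python and SQL are my main skills.")]))
```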

	### ⚡ 100% Free
	All model calls (embeddings and answer generation) go through HuggingFace's free Inference API, so no paid subscription or local GPU is needed. The free tier is rate-limited.

	### 💬 Chat History
	The app remembers your conversation, so you can ask follow-up questions naturally.
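
Conversation memory is usually just a list of messages kept for the length of the session. In a Streamlit app that list would typically live in `st.session_state`. The sketch below uses a plain list and assumes, without confirmation from the source, that recent turns are folded into the next prompt. `add_turn` and `recent_context` are hypothetical helpers.

```python
def add_turn(history: list[dict], role: str, content: str) -> None:
    """Append one message; role is "user" or "assistant"."""
    history.append({"role": role, "content": content})


def recent_context(history: list[dict], max_messages: int = 4) -> str:
    """Render the last few messages as text that can be prepended to the next prompt."""
    return "\n".join(f"{m['role']}: {m['content']}" for m in history[-max_messages:])


if __name__ == "__main__":
    history: list[dict] = []
    add_turn(history, "user", "What are my main skills?")
    add_turn(history, "assistant", "Python and SQL.")
    add_turn(history, "user", "Which projects used them?")
    print(recent_context(history))
```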

	---

	## 5. Technology Stack

	| Component | Technology | Role |
	|-----------|-----------|------|
	| User Interface | Streamlit | Creates the web-based chat interface |
	| Document Loading | LangChain | Reads and processes text files |
	| Text Splitting | RecursiveCharacterTextSplitter | Breaks documents into chunks, preferring natural boundaries |
	| Embeddings | HuggingFace API (e.g., all-MiniLM-L6-v2) | Converts text into numerical representations |
	| Vector Database | FAISS (Facebook AI Similarity Search) | Stores and searches embeddings efficiently |
	| Answer Generation | HuggingFace Inference API | Runs the LLM to generate answers |
	| Environment Mgmt | python-dotenv | Loads configuration (such as the token) from the `.env` file |

	---

	## 6. How to Use the Application

	### First-Time Setup
	1. Get a free HuggingFace account at [huggingface.co](https://huggingface.co/join)
	2. Create a token at [Settings → Tokens](https://huggingface.co/settings/tokens)
	   - Choose "Fine-grained" type
	   - Enable "Make calls to Inference Providers"
	3. Install dependencies: `pip install -r requirements.txt`
	4. Add documents to the `knowledge_base/` folder
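
If the token is kept in `.env`, python-dotenv's `load_dotenv()` copies it into the process environment at startup. The sketch below reads it back with the standard library only. The variable name `HF_TOKEN` and the helper `read_token` are assumptions, so check `app.py` for the name it actually expects.

```python
import os


def read_token(env=os.environ, name: str = "HF_TOKEN") -> str:
    """Return the token stored under `name`, or raise with a hint if it looks wrong."""
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(f"{name} is not set; add it to .env or paste it in the sidebar")
    if not token.startswith("hf_"):
        # Hugging Face user access tokens start with "hf_".
        raise RuntimeError(f"{name} does not look like a Hugging Face token (expected an 'hf_' prefix)")
    return token


if __name__ == "__main__":
    try:
        read_token()
        print("Token found")
    except RuntimeError as err:
        print(err)
```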

	### Running the App
	```
	streamlit run app.py
	```
	Then open `http://localhost:8501` in your browser.

	### Asking Questions
	1. Paste your HuggingFace token in the sidebar
	2. Wait for the knowledge base to load (green ✅ confirmation)
	3. Type your question in the chat box
	4. View the AI-generated answer and optionally expand source documents

	---

	## 7. Project File Structure

	```
	ApplicationTest1/
	├── app.py               ← Main application (320 lines)
	├── requirements.txt     ← Python package dependencies
	├── .env                 ← Stores your HuggingFace token
	├── README.md            ← Quick-start guide
	├── knowledge_base/      ← Your documents go here
	│   ├── profile.txt
	│   ├── experience.txt
	│   ├── skills.txt
	│   ├── projects.txt
	│   ├── achievements.txt
	│   └── goals.txt
	└── venv_rag/            ← Python virtual environment
	```

	---

	## 8. Error Handling

	The application includes user-friendly error handling:

	| Error | What It Means | Solution |
	|-------|--------------|----------|
	| 403 Forbidden | Token doesn't have correct permissions | Recreate token with "Inference Providers" enabled |
	| Model loading | AI model is starting up on the server | Wait 20–30 seconds and retry |
	| No documents found | Knowledge base folder is empty | Add `.txt` files and reload |
	| Embedding error | Issue converting text to numbers | Check token and selected model |
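
The "wait and retry" advice for a loading model can also be automated. This is a sketch under assumptions, not the app's code: `ModelLoadingError` is a hypothetical exception that the calling code would raise when the server reports the model is still starting, and `request` is any zero-argument function that performs the API call.

```python
import time


class ModelLoadingError(Exception):
    """Raised by the caller when the server reports the model is still starting up."""


def call_with_retry(request, attempts: int = 3, wait_seconds: float = 20.0, sleep=time.sleep):
    """Call request(); on ModelLoadingError wait and try again, up to `attempts` tries."""
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except ModelLoadingError:
            if attempt == attempts:
                raise  # out of retries: let the UI show the error
            sleep(wait_seconds)


if __name__ == "__main__":
    state = {"calls": 0}

    def fake_request():
        # Simulated API call that succeeds on the second try.
        state["calls"] += 1
        if state["calls"] < 2:
            raise ModelLoadingError("model is loading")
        return "answer"

    print(call_with_retry(fake_request, wait_seconds=0.1))
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.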

	---

	## 9. Key Takeaways

	> [!TIP]
	> Why RAG matters: Traditional AI models can "hallucinate", that is, make up information. RAG reduces this risk by grounding answers in your actual documents, which makes them easier to verify and more dependable for business and academic use.

	- RAG = Search + AI: combines document retrieval with AI generation
	- Your knowledge base is built per session: note that chunk text and questions are still sent to HuggingFace's API for embedding and answer generation
	- Free to run: no paid APIs, no GPU required
	- Customizable: swap models, tune chunk sizes, change the knowledge base anytime
	- Transparent: shows which sources were used for each answer

	---

	Report prepared for Makers Lab, SPJIMR — Term 3