	---
	title: RAG Chatbot
	sdk: docker
	colorFrom: blue
	colorTo: green
	---
	# RAG Chatbot

	This is a Retrieval-Augmented Generation (RAG) chatbot I built to serve as a factual representative for atomcamp programs, courses, and admissions. Instead of relying on the model's general knowledge, the system uses a FastAPI backend and LangChain to retrieve verified data from a Qdrant vector database, so that responses from the GPT-OSS-120B model are grounded in actual company information. Website content is embedded with a Hugging Face model and retrieved with a Maximal Marginal Relevance (MMR) strategy, which helps the bot give accurate, natural-sounding answers and reduces the "hallucinations" common in ungrounded LLM output.

	## Table of Contents

	1. Project Overview
	2. Interface Preview
	3. Core Technical Features
	4. System Architecture
	5. Tech Stack Specifications
	6. Installation and Environment Setup
	7. Configuration and API Integration
	8. Project Directory Structure
	9. Execution Workflow
	10. Key Component Breakdown
	11. Advanced UI/UX Implementation
	12. Status and Versioning

	-----

	## Project Overview

	The primary objective of this project is to eliminate the "robotic" nature of traditional AI assistants. By implementing a sophisticated RAG pipeline, the system grounds every response in verified atomcamp data. The architecture is designed for low latency, high relevance, and a professional user experience through a bespoke web interface that supports dynamic theme switching and responsive data rendering.

	-----

	## Interface Preview

	The web interface includes a custom-built theme toggle for switching between light and dark modes.

	| Light Mode Interface | Dark Mode Interface |
	| :--- | :--- |
	| ![Light Mode Interface](image.png) | ![Dark Mode Interface](image2.png) |

	-----

	## Core Technical Features

	* Official Representative Persona: The system is programmed with a custom prompt that forces the LLM to "own" the knowledge, speaking as an authoritative human representative rather than a machine reading a file.
	* High-Speed Inference: Powered by the Groq LPU (Language Processing Unit) engine using the GPT-OSS-120B model for low-latency responses.
	* Cloud-Native Vector Search: Utilizes a managed Qdrant collection for high-dimensional semantic search and retrieval.
	* Maximal Marginal Relevance (MMR): A specialized retrieval strategy that balances document relevance with information diversity to provide more comprehensive answers.
	* Dynamic Dark Mode: A fully integrated CSS variable system that swaps entire color palettes, including high-contrast link colors (Orange in Dark Mode) for maximum accessibility.
	* Auto-Expanding Interface: The input section utilizes an intelligent vertical-growth textarea that expands as the user types complex inquiries.
	* Markdown Integration: Full support for professional text formatting, including bold terms and standard bulleted lists.
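The MMR strategy listed above can be illustrated with a small, dependency-free sketch. It is not the retriever code used in this project (which goes through LangChain and Qdrant); the function names `cosine` and `mmr_select`, the toy vectors, and the `lambda_mult` value are assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, doc_vecs, k=3, lambda_mult=0.5):
    """Return indices of up to k documents chosen by Maximal Marginal Relevance.

    lambda_mult=1.0 ranks purely by relevance; lower values favour diversity.
    """
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Penalise similarity to anything that was already selected.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy 2-D vectors: documents 0 and 1 are near-duplicates, document 2 differs.
docs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]
query = [1.0, 0.2]
picked = mmr_select(query, docs, k=2, lambda_mult=0.5)
```

With these toy vectors, pure relevance ranking (`lambda_mult=1.0`) would return the two near-duplicates, while the balanced setting swaps one of them for the more distinct document.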

	-----

	## System Architecture

	### The RAG Pipeline

	1. Data Ingestion: The system crawls the official atomcamp web domain, extracts core content using BeautifulSoup4, and splits it into semantic chunks.
	2. Embedding Generation: Chunks are converted into 768-dimensional vectors using the Hugging Face all-mpnet-base-v2 model.
	3. Vector Storage: Vectors are stored in a Qdrant Cloud collection with full metadata support.
	4. User Query: The user submits a question through the FastAPI web interface.
	5. Semantic Retrieval: The system performs a similarity search in Qdrant, retrieving the top contexts while applying MMR to reduce redundancy.
	6. Contextual Synthesis: The LLM processes the retrieved context, user question, and conversation history to generate a natural, authoritative response.
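As a rough illustration of the chunking in step 1, the sketch below splits text into overlapping word windows. This is a simplification: the README does not state which splitter `ingest.py` uses, so the function name `chunk_text` and the `chunk_size` and `overlap` values are hypothetical, and a fixed-size window only approximates "semantic" chunks.

```python
def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into word-based chunks that share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# 100 placeholder words stand in for scraped page content.
sample = " ".join(f"w{i}" for i in range(100))
chunks = chunk_text(sample, chunk_size=40, overlap=10)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval quality.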

	-----

	## Tech Stack Specifications

	### Backend and Orchestration

	* Framework: FastAPI (v0.105.0) for high-performance API routing.
	* Orchestrator: LangChain (v0.3.0) for managing the RAG chain and memory.
	* Web Server: Uvicorn (v0.34.0) for asynchronous execution.

	### Artificial Intelligence

	* Inference Engine: Groq.
	* Model: openai/gpt-oss-120b.
	* Embeddings: sentence-transformers/all-mpnet-base-v2 via Hugging Face.

	### Data and Storage

	* Vector DB: Qdrant Cloud.
	* Web Scraping: BeautifulSoup4 and WebBaseLoader.

	-----

	## Installation and Environment Setup

	### 1. Repository Initialization

	Clone the project and enter the application directory:

	```bash
	git clone <your-repository-url>
	cd RAG-Based-Chatbot-main/app
	```

	### 2. Virtual Environment Creation

	It is highly recommended to use a virtual environment to manage dependencies:

	```bash
	python -m venv botenv
	# On Windows:
	botenv\Scripts\activate
	# On macOS/Linux:
	source botenv/bin/activate
	```

	### 3. Dependency Installation

	Install the required packages listed in the requirements file:

	```bash
	pip install -r requirements.txt
	```

	-----

	## Configuration and API Integration

	The system requires an environment file to manage secure credentials. Create a file named `.env` in the `app/` directory:

	```env
	# Hugging Face Access Token
	HF_TOKEN=your_huggingface_token

	# Groq Cloud API Key
	GROQ_API_KEY=your_groq_api_key

	# Qdrant Cloud Credentials
	QDRANT_API_KEY=your_qdrant_api_key
	QDRANT_URL=your_qdrant_cloud_url
	```
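A loader for these variables might look like the sketch below. It is an assumption about what `config/config.py` could do, not a copy of it: `load_settings` and `REQUIRED_KEYS` are hypothetical names, and the sketch reads from the process environment, so the `.env` file would first need to be loaded (for example with `python-dotenv`, if that is what the project uses).

```python
import os

# The four variable names come from the .env example above.
REQUIRED_KEYS = ("HF_TOKEN", "GROQ_API_KEY", "QDRANT_API_KEY", "QDRANT_URL")


def load_settings(env=None):
    """Return the required settings as a dict, or raise if any are missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

Failing fast at startup with the list of missing names is usually easier to debug than an authentication error raised later by Groq or Qdrant.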

	-----

	## Project Directory Structure

	The project follows a modular "Zero-Hurdle" root structure to ensure all paths are resolved correctly during execution:

	```text
	app/
	├── main.py # FastAPI server and API entry point
	├── chain.py # RAG logic and conversation memory
	├── ingest.py # Data crawling and vector ingestion
	├── .env # Private API keys
	├── requirements.txt # Project dependencies
	├── Prompt/
	│ └── Prompt.py # Custom representative persona
	├── LLM/
	│ └── LLM.py # Groq model configuration
	├── VectorStores/
	│ └── Vectorstores.py # Qdrant cloud connection logic
	├── embeddings/
	│ └── embedding.py # Hugging Face model setup
	├── config/
	│ └── config.py # Environment variable loader
	├── static/ # Professional UI/UX assets
	│ ├── css/
	│ │ └── style.css # Theme and layout styling
	│ ├── js/
	│ │ └── chat.js # Interaction and animation logic
	│ ├── templates/
	│ │ └── index.html # Web structure
	│ └── atomcamp_logo.png # Official organization logo
	└── Extras/ # Optional documents and other resources
	```

	-----

	## Execution Workflow

	### Step 1: Knowledge Base Ingestion

	Before running the chatbot, you must populate the vector database with atomcamp's latest information:

	```bash
	python ingest.py
	```

	### Step 2: Launching the Platform

	Start the FastAPI server to initialize the RAG chain and the web interface:

	```bash
	python main.py
	```

	### Step 3: Accessing the UI

	Open your browser and navigate to the following address:
	`http://localhost:8000`
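If the page does not load, a quick reachability check can help separate server problems from browser problems. The helper below is a hypothetical convenience script, not part of the repository; it assumes the default address shown above and uses only the standard library.

```python
import urllib.error
import urllib.request

# Bypass any system proxy settings, since the target is a local server.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def server_is_up(url="http://localhost:8000", timeout=2.0):
    """Return True if the URL answers with any HTTP response, else False."""
    try:
        with _opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server responded, even if with an error status
    except OSError:
        return False  # connection refused, timeout, DNS failure, ...


if __name__ == "__main__":
    print("UI reachable:", server_is_up())
```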

	-----

	## Key Component Breakdown

	### Advanced Prompt Engineering (Prompt.py)

	The core of the system's behavior lies in its specialized persona. The prompt explicitly forbids robotic disclaimers such as "According to the context" or "The information provided states." It instructs the model to merge duplicate facts into single, well-formed sentences and to strictly enforce the lowercase atomcamp branding.
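The rules described above could be expressed roughly as follows. The template wording, `FORBIDDEN_PHRASES`, and the helper names are all hypothetical; the actual prompt text lives in `Prompt/Prompt.py` and is not reproduced in this README. The two post-processing helpers are an optional safety net that this sketch adds on top of the prompt, not something the README says the project does.

```python
import re

# Hypothetical wording; the real prompt is defined in Prompt/Prompt.py.
SYSTEM_TEMPLATE = (
    "You are an official atomcamp representative. Answer from the context "
    "below as your own knowledge. Never use phrases such as "
    "'According to the context'. Merge duplicate facts into one sentence. "
    "Always write the brand as lowercase 'atomcamp'.\n\n"
    "Context:\n{context}\n\n"
    "Conversation so far:\n{history}\n\n"
    "Question: {question}"
)

# Lowercased fragments that signal a robotic disclaimer.
FORBIDDEN_PHRASES = ("according to the context", "the information provided")


def build_prompt(context, history, question):
    """Fill the template with retrieved context, chat history, and the question."""
    return SYSTEM_TEMPLATE.format(context=context, history=history, question=question)


def normalize_brand(text):
    """Rewrite any casing of the brand name (Atomcamp, AtomCamp, ...) to 'atomcamp'."""
    return re.sub(r"\batomcamp\b", "atomcamp", text, flags=re.IGNORECASE)


def violates_persona(text):
    """Return True if the reply contains one of the forbidden disclaimers."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in FORBIDDEN_PHRASES)
```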

	### Retrieval Logic (chain.py)

	The system uses a `RunnableParallel` architecture. When a question is received, the chain retrieves relevant documents from Qdrant and pulls the last several turns of conversation from memory in parallel, then assembles the final prompt for the LLM. Running these steps in parallel reduces response latency.
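LangChain's `RunnableParallel` runs several branches on the same input and returns their outputs as one dictionary. The sketch below reproduces that fan-out pattern with the standard library's `ThreadPoolExecutor` so it runs without LangChain installed; it is an analogue, not the code in `chain.py`. The function names, the placeholder retriever, and the six-turn memory window are assumptions.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

# Keep only the most recent turns; 6 is an assumed window size.
memory = deque(maxlen=6)


def retrieve_context(question):
    """Stand-in for the Qdrant retriever; returns placeholder text."""
    return f"[documents relevant to: {question}]"


def load_history(_question):
    """Format the remembered turns as plain text."""
    return "\n".join(f"{role}: {text}" for role, text in memory)


def prepare_inputs(question):
    """Run retrieval and history lookup concurrently, then merge the results."""
    branches = {"context": retrieve_context, "history": load_history}
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        futures = {name: pool.submit(fn, question) for name, fn in branches.items()}
        inputs = {name: future.result() for name, future in futures.items()}
    inputs["question"] = question
    return inputs


memory.append(("user", "Hi"))
memory.append(("assistant", "Hello, how can I help?"))
inputs = prepare_inputs("What courses are offered?")
```

The resulting dictionary has the keys `context`, `history`, and `question`, which is the shape a prompt template would typically consume in the next step of the chain.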

	-----

	## Advanced UI/UX Implementation

	* Professional Input Handling: The input bar is a custom textarea that supports `Enter` to send and `Shift + Enter` for new lines, behaving like a modern enterprise communication tool.
	* Horizontal Container Stability: The chat bubbles use CSS word-break rules so that long strings of code or characters wrap instead of overflowing the container's width.
	* Thematic Link Management: Links are styled with a dynamic CSS variable (`--link-color`) that switches to a high-contrast orange in dark mode, ensuring email addresses and URLs are always readable.

	-----

	## Status and Versioning

	* Version: 1.5.0
	* Last Updated: July 2025
	* Status: Production Ready
	* Organization: atomcamp Official