Spaces: Al1Abdullah / rag-chatbot

title: RAG Chatbot
sdk: docker
colorFrom: blue
colorTo: green

RAG Chatbot

This is a Retrieval-Augmented Generation (RAG) chatbot I built to serve as a factual representative for atomcamp programs, courses, and admissions. Instead of relying on the model's general knowledge, the system uses a FastAPI backend and LangChain to retrieve verified data from a Qdrant vector database, so that responses from the GPT-OSS-120B model are grounded in actual company information. Website data is embedded with a Hugging Face model and retrieved with a Maximal Marginal Relevance (MMR) strategy, which helps the bot give accurate, natural-sounding answers and reduces the "hallucinations" common in ungrounded LLM output.

Project Overview
Interface Preview
Core Technical Features
System Architecture
Tech Stack Specifications
Installation and Environment Setup
Configuration and API Integration
Project Directory Structure
Execution Workflow
Key Component Breakdown
Advanced UI/UX Implementation
Status and Versioning

Project Overview

The primary objective of this project is to eliminate the "robotic" nature of traditional AI assistants. By implementing a sophisticated RAG pipeline, the system grounds every response in verified atomcamp data. The architecture is designed for low latency, high relevance, and a professional user experience through a bespoke web interface that supports dynamic theme switching and responsive data rendering.

Interface Preview

The platform provides a professional-grade interface with a custom-built theme toggle to support various user environments.

[Screenshots: Light Mode Interface | Dark Mode Interface]

Core Technical Features

Official Representative Persona: The system is programmed with a custom prompt that forces the LLM to "own" the knowledge, speaking as an authoritative human representative rather than a machine reading a file.
High-Speed Inference: Powered by the Groq LPU (Language Processing Unit) engine using the GPT-OSS-120B model for near-instantaneous responses.
Cloud-Native Vector Search: Utilizes a managed Qdrant collection for high-dimensional semantic search and retrieval.
Maximal Marginal Relevance (MMR): A specialized retrieval strategy that balances document relevance with information diversity to provide more comprehensive answers.
Dynamic Dark Mode: A fully integrated CSS variable system that swaps entire color palettes, including high-contrast link colors (Orange in Dark Mode) for maximum accessibility.
Auto-Expanding Interface: The input section utilizes an intelligent vertical-growth textarea that expands as the user types complex inquiries.
Markdown Integration: Full support for professional text formatting, including bold terms and standard bulleted lists.
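
To make the MMR feature above concrete, here is a minimal, standard-library sketch of how Maximal Marginal Relevance trades relevance against diversity. It is an illustration only, not the project's code: the function names, the toy 2-dimensional vectors, and the `lambda_mult` default are assumptions, and the real system delegates this step to LangChain and Qdrant.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, doc_vecs, k=3, lambda_mult=0.5):
    """Return indices of up to k documents chosen by Maximal Marginal Relevance.

    lambda_mult=1.0 ranks purely by relevance to the query; lower values
    penalize documents that resemble ones already selected.
    """
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:

        def score(i):
            # Redundancy is the highest similarity to any already-selected document.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy example (hypothetical vectors): documents 0 and 1 are near-duplicates.
query = [1.0, 0.2]
docs = [[1.0, 0.0], [0.99, 0.05], [0.6, 0.8]]
picked = mmr_select(query, docs, k=2, lambda_mult=0.5)
```

In this toy example the two near-duplicate documents are not both returned: once one of them is selected, the other is penalized for redundancy and the more distinct third document is chosen instead.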

System Architecture

The RAG Pipeline

Data Ingestion: The system crawls the official atomcamp web domain, extracts core content using BeautifulSoup4, and splits it into semantic chunks.
Embedding Generation: Chunks are converted into 768-dimensional vectors using the Hugging Face all-mpnet-base-v2 model.
Vector Storage: Vectors are stored in a Qdrant Cloud collection with full metadata support.
User Query: The user submits a question through the FastAPI web interface.
Semantic Retrieval: The system performs a similarity search in Qdrant, retrieving the top contexts while applying MMR to reduce redundancy.
Contextual Synthesis: The LLM processes the retrieved context, user question, and conversation history to generate a natural, authoritative response.
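
The chunking part of the Data Ingestion step can be sketched as follows. This is a simplified, character-based splitter with overlap, written under assumptions: the README does not state which splitter, chunk size, or overlap `ingest.py` actually uses, so `split_into_chunks` and its defaults are hypothetical.

```python
def split_into_chunks(text, chunk_size=500, overlap=50):
    """Split text into fixed-size chunks that share `overlap` characters.

    Simplified stand-in for the project's splitter; sizes are assumptions.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


# Small demonstration with a 25-character string.
sample_chunks = split_into_chunks("abcdefghijklmnopqrstuvwxy", chunk_size=10, overlap=2)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval quality.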

Tech Stack Specifications

Backend and Orchestration

Framework: FastAPI (v0.105.0) for high-performance API routing.
Orchestrator: LangChain (v0.3.0) for managing the RAG chain and memory.
Web Server: Uvicorn (v0.34.0) for asynchronous execution.

Artificial Intelligence

Inference Engine: Groq.
Model: openai/gpt-oss-120b.
Embeddings: sentence-transformers/all-mpnet-base-v2 via Hugging Face.

Data and Storage

Vector DB: Qdrant Cloud.
Web Scraping: BeautifulSoup4 and WebBaseLoader.

Installation and Environment Setup

1. Repository Initialization

Clone the project and enter the application directory:

git clone <your-repository-url>
cd RAG-Based-Chatbot-main/app

2. Virtual Environment Creation

It is highly recommended to use a virtual environment to manage dependencies:

python -m venv botenv
# On Windows:
botenv\Scripts\activate
# On macOS/Linux:
source botenv/bin/activate

3. Dependency Installation

Install the required packages listed in the requirements file:

pip install -r requirements.txt

Configuration and API Integration

The system requires an environment file to manage secure credentials. Create a file named .env in the app/ directory:

# Hugging Face Access Token
HF_TOKEN=your_huggingface_token

# Groq Cloud API Key
GROQ_API_KEY=your_groq_api_key

# Qdrant Cloud Credentials
QDRANT_API_KEY=your_qdrant_api_key
QDRANT_URL=your_qdrant_cloud_url
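
As a rough illustration of how these variables might be validated at startup, the sketch below checks that all four keys are present before the app continues. It is not the contents of `config/config.py`: `load_settings` and `REQUIRED_KEYS` are hypothetical names, and the code assumes the values are already in the process environment (for example, after a loader such as python-dotenv has read `.env`).

```python
import os

# The four variable names come from the .env example above.
REQUIRED_KEYS = ("HF_TOKEN", "GROQ_API_KEY", "QDRANT_API_KEY", "QDRANT_URL")


def load_settings(env=None):
    """Return the required settings as a dict, or raise if any are missing.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {key: env[key] for key in REQUIRED_KEYS}


# Demonstration with placeholder values (not real credentials).
example_settings = load_settings(
    {
        "HF_TOKEN": "your_huggingface_token",
        "GROQ_API_KEY": "your_groq_api_key",
        "QDRANT_API_KEY": "your_qdrant_api_key",
        "QDRANT_URL": "your_qdrant_cloud_url",
    }
)
```

Failing fast with a list of the missing names is usually easier to debug than an authentication error raised later by one of the API clients.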

Project Directory Structure

The project follows a modular "Zero-Hurdle" root structure to ensure all paths are resolved correctly during execution:

app/
├── main.py                  # FastAPI server and API entry point
├── chain.py                 # RAG logic and conversation memory
├── ingest.py                # Data crawling and vector ingestion
├── .env                     # Private API keys
├── requirements.txt         # Project dependencies
├── Prompt/
│   └── Prompt.py            # Custom representative persona
├── LLM/
│   └── LLM.py               # Groq model configuration
├── VectorStores/
│   └── Vectorstores.py      # Qdrant cloud connection logic
├── embeddings/
│   └── embedding.py         # Hugging Face model setup
├── config/
│   └── config.py            # Environment variable loader
├── static/                  # Professional UI/UX assets
│   ├── css/
│   │   └── style.css        # Theme and layout styling
│   ├── js/
│   │   └── chat.js          # Interaction and animation logic
│   ├── templates/
│   │   └── index.html       # Web structure
│   └── atomcamp_logo.png    # Official organization logo
└── Extras/                  # Optional documents and other resources

Execution Workflow

Step 1: Knowledge Base Ingestion

Before running the chatbot, you must populate the vector database with atomcamp's latest information:

python ingest.py

Step 2: Launching the Platform

Start the FastAPI server to initialize the RAG chain and the web interface:

python main.py

Step 3: Accessing the UI

Open your browser and navigate to the following address: http://localhost:8000

Key Component Breakdown

Advanced Prompt Engineering (Prompt.py)

The core of the system's intelligence lies in its specialized persona. The prompt explicitly forbids robotic disclaimers such as "According to the context" or "The information provided states." It instructs the AI to merge duplicate facts into single, well-formed sentences and to strictly enforce the lowercase atomcamp branding.
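
A prompt of this kind might be assembled roughly as shown below. The wording of the persona text is invented for illustration and is not the actual content of `Prompt.py`; `build_prompt` and its parameters are likewise hypothetical.

```python
def build_prompt(context_chunks, history, question):
    """Assemble a single prompt string from persona, context, history, and question."""
    # Hypothetical persona wording, paraphrasing the rules described above.
    persona = (
        "You are an official atomcamp representative. "
        "Answer directly and naturally, without phrases such as "
        "'According to the context'. "
        "Merge duplicate facts into a single statement and always write "
        "'atomcamp' in lowercase."
    )
    context = "\n\n".join(context_chunks)
    past_turns = "\n".join(history)
    return (
        f"{persona}\n\n"
        f"Context:\n{context}\n\n"
        f"Conversation so far:\n{past_turns}\n\n"
        f"Question: {question}"
    )


# Demonstration with placeholder inputs.
demo_prompt = build_prompt(
    ["atomcamp offers courses."],
    ["user: hi", "assistant: hello"],
    "What courses are available?",
)
```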

Retrieval Logic (chain.py)

The system uses a RunnableParallel architecture. When a question is received, the chain retrieves relevant documents from Qdrant and pulls the last several turns of conversation from memory in parallel, then assembles the final prompt for the LLM. Running the independent steps in parallel reduces response latency.
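
The idea of fetching independent inputs side by side can be sketched with the standard library alone. This is a stand-in that uses `concurrent.futures` in place of LangChain's RunnableParallel, so it shows the pattern rather than the code in `chain.py`; `retrieve_documents`, `load_history`, and `gather_inputs` are hypothetical placeholders for the Qdrant retriever and the conversation memory.

```python
from concurrent.futures import ThreadPoolExecutor


def retrieve_documents(question):
    # Hypothetical stand-in for the Qdrant MMR retriever.
    return [f"(document relevant to: {question})"]


def load_history(session_id):
    # Hypothetical stand-in for the conversation memory lookup.
    return ["user: hi", "assistant: hello"]


def gather_inputs(question, session_id):
    """Run retrieval and history lookup concurrently, then merge the results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        docs_future = pool.submit(retrieve_documents, question)
        history_future = pool.submit(load_history, session_id)
        return {
            "context": docs_future.result(),
            "history": history_future.result(),
            "question": question,
        }


inputs = gather_inputs("What programs does atomcamp offer?", session_id="demo")
```

Because the two lookups do not depend on each other, the total wait is roughly that of the slower one rather than the sum of both.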

Advanced UI/UX Implementation

Professional Input Handling: The input bar is a custom textarea that supports Enter to send and Shift + Enter for new lines, behaving like a modern enterprise communication tool.
Vertical Container Stability: The chat bubbles use word-break rules so that long strings of code or unbroken characters wrap instead of overflowing the container horizontally.
Thematic Link Management: Links are styled with a dynamic CSS variable (--link-color) that switches to a high-contrast orange in dark mode, ensuring email addresses and URLs are always readable.

Status and Versioning

Version: 1.5.0
Last Updated: July 2025
Status: Production Ready
Organization: atomcamp Official