title: RAG Chatbot
sdk: docker
colorFrom: blue
colorTo: green
RAG Chatbot
This is a Retrieval-Augmented Generation (RAG) chatbot I built to serve as a factual representative for atomcamp programs, courses, and admissions. Instead of relying on the model's general knowledge, the system uses a FastAPI backend and LangChain to retrieve verified data from a Qdrant vector database, so that responses from the GPT-OSS-120B model are grounded in actual company information. Website data is embedded with a Hugging Face model and retrieved with a Maximal Marginal Relevance (MMR) strategy, which lets the bot give accurate, natural-sounding answers and reduces the "hallucinations" common in ungrounded LLM output.
Table of Contents
- Project Overview
- Interface Preview
- Core Technical Features
- System Architecture
- Tech Stack Specifications
- Installation and Environment Setup
- Configuration and API Integration
- Project Directory Structure
- Execution Workflow
- Key Component Breakdown
- Advanced UI/UX Implementation
- Status and Versioning
Project Overview
The primary objective of this project is to eliminate the "robotic" nature of traditional AI assistants. By implementing a sophisticated RAG pipeline, the system grounds every response in verified atomcamp data. The architecture is designed for low latency, high relevance, and a professional user experience through a bespoke web interface that supports dynamic theme switching and responsive data rendering.
Interface Preview
The platform provides a professional-grade interface with a custom-built theme toggle to support various user environments.
Core Technical Features
- Official Representative Persona: The system is programmed with a custom prompt that forces the LLM to "own" the knowledge, speaking as an authoritative human representative rather than a machine reading a file.
- High-Speed Inference: Powered by the Groq LPU (Language Processing Unit) engine using the GPT-OSS-120B model for near-instantaneous responses.
- Cloud-Native Vector Search: Utilizes a managed Qdrant collection for high-dimensional semantic search and retrieval.
- Maximal Marginal Relevance (MMR): A specialized retrieval strategy that balances document relevance with information diversity to provide more comprehensive answers.
- Dynamic Dark Mode: A fully integrated CSS variable system that swaps entire color palettes, including high-contrast link colors (Orange in Dark Mode) for maximum accessibility.
- Auto-Expanding Interface: The input section utilizes an intelligent vertical-growth textarea that expands as the user types complex inquiries.
- Markdown Integration: Full support for professional text formatting, including bold terms and standard bulleted lists.
System Architecture
The RAG Pipeline
- Data Ingestion: The system crawls the official atomcamp web domain, extracts core content using BeautifulSoup4, and splits it into semantic chunks.
- Embedding Generation: Chunks are converted into 768-dimensional vectors using the Hugging Face all-mpnet-base-v2 model.
- Vector Storage: Vectors are stored in a Qdrant Cloud collection with full metadata support.
- User Query: The user submits a question through the FastAPI web interface.
- Semantic Retrieval: The system performs a similarity search in Qdrant, retrieving the top contexts while applying MMR to reduce redundancy.
- Contextual Synthesis: The LLM processes the retrieved context, user question, and conversation history to generate a natural, authoritative response.
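The MMR step in the pipeline above can be illustrated with a minimal, dependency-free sketch. It is not the project's retrieval code (which goes through LangChain and Qdrant); the function names `cosine` and `mmr_select`, the `lambda_mult` parameter name, and the toy 2-dimensional vectors are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query_vec, doc_vecs, k=3, lambda_mult=0.5):
    """Return indices of up to k documents chosen by Maximal Marginal Relevance.

    lambda_mult=1.0 ranks purely by relevance to the query;
    lower values penalise similarity to documents already selected.
    """
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    candidates = list(range(len(doc_vecs)))
    selected = []
    while candidates and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in candidates:
            # Redundancy: highest similarity to anything already picked.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected


# Toy data: documents 0 and 1 are near-duplicates, document 2 is different.
query = [1.0, 0.0]
docs = [[1.0, 0.1], [1.0, 0.12], [0.6, 0.8]]
print(mmr_select(query, docs, k=2, lambda_mult=1.0))  # relevance only
print(mmr_select(query, docs, k=2, lambda_mult=0.3))  # favours diversity
```

With `lambda_mult=1.0` the two near-duplicate documents are both returned; lowering it swaps the second duplicate for the less similar document, which is the redundancy reduction described in the retrieval step.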
Tech Stack Specifications
Backend and Orchestration
- Framework: FastAPI (v0.105.0) for high-performance API routing.
- Orchestrator: LangChain (v0.3.0) for managing the RAG chain and memory.
- Web Server: Uvicorn (v0.34.0) for asynchronous execution.
Artificial Intelligence
- Inference Engine: Groq.
- Model: openai/gpt-oss-120b.
- Embeddings: sentence-transformers/all-mpnet-base-v2 via Hugging Face.
Data and Storage
- Vector DB: Qdrant Cloud.
- Web Scraping: BeautifulSoup4 and WebBaseLoader.
Installation and Environment Setup
1. Repository Initialization
Clone the project and enter the application directory:
git clone <your-repository-url>
cd RAG-Based-Chatbot-main/app
2. Virtual Environment Creation
It is highly recommended to use a virtual environment to manage dependencies:
python -m venv botenv
# On Windows:
botenv\Scripts\activate
# On macOS/Linux:
source botenv/bin/activate
3. Dependency Installation
Install the required packages listed in the requirements file:
pip install -r requirements.txt
Configuration and API Integration
The system requires an environment file to manage secure credentials. Create a file named .env in the app/ directory:
# Hugging Face Access Token
HF_TOKEN=your_huggingface_token
# Groq Cloud API Key
GROQ_API_KEY=your_groq_api_key
# Qdrant Cloud Credentials
QDRANT_API_KEY=your_qdrant_api_key
QDRANT_URL=your_qdrant_cloud_url
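As a rough sketch of how a loader such as config/config.py might validate these four variables at startup, the snippet below reads them from the process environment and fails early if any are missing. The function name `load_settings` and the `REQUIRED_KEYS` tuple are hypothetical, and the sketch assumes the .env file has already been loaded into the environment (for example by a dotenv-style helper); the project's actual loader may differ.

```python
import os

# The four variable names come from the .env template above.
REQUIRED_KEYS = ("HF_TOKEN", "GROQ_API_KEY", "QDRANT_API_KEY", "QDRANT_URL")


def load_settings(env=None):
    """Return the required settings as a dict, or raise if any are unset or empty.

    `env` defaults to os.environ; passing a plain dict makes the function easy to test.
    """
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}


if __name__ == "__main__":
    try:
        settings = load_settings()
        print("Loaded settings for:", ", ".join(settings))
    except RuntimeError as exc:
        print(exc)
```

Failing at startup with the list of missing keys is usually easier to debug than an authentication error raised later by Groq or Qdrant.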
Project Directory Structure
The project follows a modular "Zero-Hurdle" root structure to ensure all paths are resolved correctly during execution:
app/
├── main.py # FastAPI server and API entry point
├── chain.py # RAG logic and conversation memory
├── ingest.py # Data crawling and vector ingestion
├── .env # Private API keys
├── requirements.txt # Project dependencies
├── Prompt/
│   └── Prompt.py # Custom representative persona
├── LLM/
│   └── LLM.py # Groq model configuration
├── VectorStores/
│   └── Vectorstores.py # Qdrant cloud connection logic
├── embeddings/
│   └── embedding.py # Hugging Face model setup
├── config/
│   └── config.py # Environment variable loader
├── static/ # Professional UI/UX assets
│   ├── css/
│   │   └── style.css # Theme and layout styling
│   ├── js/
│   │   └── chat.js # Interaction and animation logic
│   ├── templates/
│   │   └── index.html # Web structure
│   └── atomcamp_logo.png # Official organization logo
└── Extras/ # Documents and other resources, if any
Execution Workflow
Step 1: Knowledge Base Ingestion
Before running the chatbot, you must populate the vector database with atomcamp's latest information:
python ingest.py
Step 2: Launching the Platform
Start the FastAPI server to initialize the RAG chain and the web interface:
python main.py
Step 3: Accessing the UI
Open your browser and navigate to the following address:
http://localhost:8000
Key Component Breakdown
Advanced Prompt Engineering (Prompt.py)
The core of the system's intelligence lies in its specialized persona. The prompt explicitly forbids robotic disclaimers such as "According to the context" or "The information provided states". It instructs the model to merge duplicate facts into single, well-formed sentences and to strictly enforce the lowercase atomcamp branding.
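To make these rules concrete, here is a small illustrative sketch. It is not the contents of Prompt.py: the rule wording, the `build_system_prompt` function, and the `style_violations` checker are all hypothetical, and the checker is an extra illustration of how the stated rules could be verified against a generated answer.

```python
import re

# Phrases the prompt is described as forbidding (lowercased for matching).
FORBIDDEN_PHRASES = ("according to the context", "the information provided state")


def build_system_prompt():
    """Assemble a persona prompt from the rules described above (wording is assumed)."""
    rules = [
        "Answer as an official atomcamp representative who owns the knowledge.",
        "Never use disclaimers such as 'According to the context'.",
        "Merge duplicate facts into a single clear sentence.",
        "Always write the brand name in lowercase: atomcamp.",
    ]
    return "\n".join(f"- {rule}" for rule in rules)


def style_violations(answer):
    """Return a list of rule violations found in a generated answer."""
    lowered = answer.lower()
    problems = [phrase for phrase in FORBIDDEN_PHRASES if phrase in lowered]
    # Any spelling other than all-lowercase (Atomcamp, AtomCamp, ...) is flagged.
    spellings = re.findall(r"atomcamp", answer, flags=re.IGNORECASE)
    if any(spelling != "atomcamp" for spelling in spellings):
        problems.append("brand not lowercase")
    return problems


print(build_system_prompt())
print(style_violations("According to the context, Atomcamp offers bootcamps."))
```

A checker like this could run in tests against sample answers to catch regressions when the prompt is edited.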
Retrieval Logic (chain.py)
The system utilizes a RunnableParallel architecture. When a question is received, the chain simultaneously retrieves relevant documents from Qdrant, pulls the last several turns of conversation from memory, and prepares the final prompt for the LLM. This parallel execution minimizes response latency.
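The fan-out idea can be sketched with the standard library alone. This is an analogy to the parallel step, not LangChain's RunnableParallel and not the code in chain.py: `retrieve_documents`, `load_history`, and `build_prompt_inputs` are hypothetical stand-ins, and the stub data is invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def retrieve_documents(question):
    """Hypothetical stand-in for the Qdrant retriever."""
    return [f"(stub) document relevant to: {question}"]


def load_history(question, turns=3):
    """Hypothetical stand-in for conversation memory; returns the last `turns` entries."""
    history = [
        "user: hi",
        "assistant: hello",
        "user: what courses exist?",
        "assistant: several",
    ]
    return history[-turns:]


def build_prompt_inputs(question):
    """Run retrieval and history lookup concurrently, then merge the results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        docs_future = pool.submit(retrieve_documents, question)
        history_future = pool.submit(load_history, question)
        return {
            "context": "\n".join(docs_future.result()),
            "history": "\n".join(history_future.result()),
            "question": question,
        }


inputs = build_prompt_inputs("When do admissions open?")
print(sorted(inputs))
```

Because both lookups are I/O-bound in a real deployment (a network call to Qdrant, a memory read), running them side by side means the total wait is roughly the slower of the two rather than their sum.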
Advanced UI/UX Implementation
- Professional Input Handling: The input bar is a custom textarea that supports Enter to send and Shift + Enter for new lines, behaving like a modern enterprise communication tool.
- Vertical Container Stability: The chat bubbles use advanced word-break rules to ensure that long strings of code or characters never break the horizontal container width.
- Thematic Link Management: Links are styled with a dynamic CSS variable (--link-color) that switches to a high-contrast orange in dark mode, ensuring email addresses and URLs are always readable.
Status and Versioning
- Version: 1.5.0
- Last Updated: July 2025
- Status: Production Ready
- Organization: atomcamp Official

