| title: RAG Chatbot | |
| sdk: docker | |
| colorFrom: blue | |
| colorTo: green | |
| # RAG Chatbot | |
| This is a Retrieval-Augmented Generation (RAG) chatbot I built to serve as a factual representative for **atomcamp** programs, courses, and admissions. Instead of relying on the model's general knowledge, the system uses a **FastAPI** backend and **LangChain** to retrieve verified data from a **Qdrant** vector database, so that responses from the **GPT-OSS-120B** model are grounded in actual company information. Website content is embedded with a **Hugging Face** sentence-transformer model and retrieved with a **Maximal Marginal Relevance (MMR)** strategy, which helps the bot give accurate, natural-sounding answers and reduces the risk of hallucination. | |
| ## Table of Contents | |
| 1. Project Overview | |
| 2. Interface Preview | |
| 3. Core Technical Features | |
| 4. System Architecture | |
| 5. Tech Stack Specifications | |
| 6. Installation and Environment Setup | |
| 7. Configuration and API Integration | |
| 8. Project Directory Structure | |
| 9. Execution Workflow | |
| 10. Key Component Breakdown | |
| 11. Advanced UI/UX Implementation | |
| 12. Status and Versioning | |
| ----- | |
| ## Project Overview | |
| The primary objective of this project is to eliminate the "robotic" nature of traditional AI assistants. By implementing a sophisticated RAG pipeline, the system grounds every response in verified atomcamp data. The architecture is designed for low latency, high relevance, and a professional user experience through a bespoke web interface that supports dynamic theme switching and responsive data rendering. | |
| ----- | |
| ## Interface Preview | |
| The platform provides a professional-grade interface with a custom-built theme toggle to support various user environments. | |
| | Light Mode Interface | Dark Mode Interface | | |
| | :--- | :--- | | |
| |  |  | | |
| ----- | |
| ## Core Technical Features | |
| * Official Representative Persona: The system is programmed with a custom prompt that forces the LLM to "own" the knowledge, speaking as an authoritative human representative rather than a machine reading a file. | |
| * High-Speed Inference: Powered by the Groq LPU (Language Processing Unit) engine using the GPT-OSS-120B model for near-instantaneous responses. | |
| * Cloud-Native Vector Search: Utilizes a managed Qdrant collection for high-dimensional semantic search and retrieval. | |
| * Maximal Marginal Relevance (MMR): A specialized retrieval strategy that balances document relevance with information diversity to provide more comprehensive answers. | |
| * Dynamic Dark Mode: A fully integrated CSS variable system that swaps entire color palettes, including high-contrast link colors (Orange in Dark Mode) for maximum accessibility. | |
| * Auto-Expanding Interface: The input section utilizes an intelligent vertical-growth textarea that expands as the user types complex inquiries. | |
| * Markdown Integration: Full support for professional text formatting, including bold terms and standard bulleted lists. | |
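The MMR feature above can be illustrated with a small, dependency-free sketch. This is not the project's retrieval code (which goes through LangChain and Qdrant); the function names `cosine`, `unit`, and `mmr_select` and the toy 2-D vectors are assumptions made purely for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def unit(degrees):
    """Toy 2-D unit vector at the given angle (stand-in for an embedding)."""
    radians = math.radians(degrees)
    return [math.cos(radians), math.sin(radians)]


def mmr_select(query_vec, doc_vecs, k=3, lambda_mult=0.5):
    """Return indices of up to k documents chosen by Maximal Marginal Relevance.

    lambda_mult=1.0 ranks purely by relevance; lower values favor diversity.
    """
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in candidates:
            relevance = cosine(query_vec, doc_vecs[i])
            # Penalize similarity to anything already chosen.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected


query = unit(0)
docs = [unit(10), unit(12), unit(-30)]  # docs 0 and 1 are near-duplicates

print(mmr_select(query, docs, k=2, lambda_mult=1.0))  # [0, 1]
print(mmr_select(query, docs, k=2, lambda_mult=0.5))  # [0, 2]
```

With pure relevance ranking the two near-duplicates are both returned; with `lambda_mult=0.5` the second slot goes to the less similar document, which is the trade-off MMR is meant to provide.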
| ----- | |
| ## System Architecture | |
| ### The RAG Pipeline | |
| 1. Data Ingestion: The system crawls the official atomcamp web domain, extracts core content using BeautifulSoup4, and splits it into semantic chunks. | |
| 2. Embedding Generation: Chunks are converted into 768-dimensional vectors using the Hugging Face all-mpnet-base-v2 model. | |
| 3. Vector Storage: Vectors are stored in a Qdrant Cloud collection with full metadata support. | |
| 4. User Query: The user submits a question through the FastAPI web interface. | |
| 5. Semantic Retrieval: The system performs a similarity search in Qdrant, retrieving the top contexts while applying MMR to reduce redundancy. | |
| 6. Contextual Synthesis: The LLM processes the retrieved context, user question, and conversation history to generate a natural, authoritative response. | |
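As a rough illustration of the chunking in step 1, the sketch below splits text into fixed-size, overlapping character windows. The README does not say which splitter `ingest.py` uses, so this is a simplified stand-in rather than the project's method; `chunk_text`, `chunk_size`, and `overlap` are hypothetical names.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters, so text near a boundary
    appears in both neighboring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
# ['abcd', 'defg', 'ghij']
```

Each chunk would then be embedded and stored as one vector, so the chunk size bounds how much text a single retrieved result can contribute to the prompt.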
| ----- | |
| ## Tech Stack Specifications | |
| ### Backend and Orchestration | |
| * Framework: FastAPI (v0.105.0) for high-performance API routing. | |
| * Orchestrator: LangChain (v0.3.0) for managing the RAG chain and memory. | |
| * Web Server: Uvicorn (v0.34.0) for asynchronous execution. | |
| ### Artificial Intelligence | |
| * Inference Engine: Groq. | |
| * Model: openai/gpt-oss-120b. | |
| * Embeddings: sentence-transformers/all-mpnet-base-v2 via Hugging Face. | |
| ### Data and Storage | |
| * Vector DB: Qdrant Cloud. | |
| * Web Scraping: BeautifulSoup4 and WebBaseLoader. | |
| ----- | |
| ## Installation and Environment Setup | |
| ### 1. Repository Initialization | |
| Clone the project and enter the application directory: | |
| ```bash | |
| git clone <your-repository-url> | |
| cd RAG-Based-Chatbot-main/app | |
| ``` | |
| ### 2. Virtual Environment Creation | |
| It is highly recommended to use a virtual environment to manage dependencies: | |
| ```bash | |
| python -m venv botenv | |
| # On Windows: | |
| botenv\Scripts\activate | |
| # On macOS/Linux: | |
| source botenv/bin/activate | |
| ``` | |
| ### 3. Dependency Installation | |
| Install the required packages listed in the requirements file: | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| ----- | |
| ## Configuration and API Integration | |
| The system requires an environment file to manage secure credentials. Create a file named `.env` in the `app/` directory: | |
| ```env | |
| # Hugging Face Access Token | |
| HF_TOKEN=your_huggingface_token | |
| # Groq Cloud API Key | |
| GROQ_API_KEY=your_groq_api_key | |
| # Qdrant Cloud Credentials | |
| QDRANT_API_KEY=your_qdrant_api_key | |
| QDRANT_URL=your_qdrant_cloud_url | |
| ``` | |
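A loader for these variables might fail fast when one is missing, rather than surfacing an authentication error later. The sketch below assumes the `.env` file has already been loaded into the process environment (for example with python-dotenv's `load_dotenv()`); `REQUIRED_KEYS` and `load_settings` are hypothetical names and not necessarily what `config/config.py` contains.

```python
import os

# The four variables listed in the .env example above.
REQUIRED_KEYS = ("HF_TOKEN", "GROQ_API_KEY", "QDRANT_API_KEY", "QDRANT_URL")


def load_settings(env=None):
    """Return the required settings as a dict, or raise if any are missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}


# Demonstration with an explicit mapping instead of the real environment.
example_env = {key: "placeholder" for key in REQUIRED_KEYS}
settings = load_settings(example_env)
print(sorted(settings))
```

Calling `load_settings()` with no argument reads from `os.environ`, which is what an application entry point would normally do.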
| ----- | |
| ## Project Directory Structure | |
| The project follows a modular "Zero-Hurdle" root structure to ensure all paths are resolved correctly during execution: | |
| ```text | |
| app/ | |
| ├── main.py                 # FastAPI server and API entry point | |
| ├── chain.py                # RAG logic and conversation memory | |
| ├── ingest.py               # Data crawling and vector ingestion | |
| ├── .env                    # Private API keys | |
| ├── requirements.txt        # Project dependencies | |
| ├── Prompt/ | |
| │   └── Prompt.py           # Custom representative persona | |
| ├── LLM/ | |
| │   └── LLM.py              # Groq model configuration | |
| ├── VectorStores/ | |
| │   └── Vectorstores.py     # Qdrant cloud connection logic | |
| ├── embeddings/ | |
| │   └── embedding.py        # Hugging Face model setup | |
| ├── config/ | |
| │   └── config.py           # Environment variable loader | |
| ├── static/                 # Professional UI/UX assets | |
| │   ├── css/ | |
| │   │   └── style.css       # Theme and layout styling | |
| │   ├── js/ | |
| │   │   └── chat.js         # Interaction and animation logic | |
| │   ├── templates/ | |
| │   │   └── index.html      # Web structure | |
| │   └── atomcamp_logo.png   # Official organization logo | |
| └── Extras/                 # Documents and other resources, if any | |
| ``` | |
| ----- | |
| ## Execution Workflow | |
| ### Step 1: Knowledge Base Ingestion | |
| Before running the chatbot, you must populate the vector database with atomcamp's latest information: | |
| ```bash | |
| python ingest.py | |
| ``` | |
| ### Step 2: Launching the Platform | |
| Start the FastAPI server to initialize the RAG chain and the web interface: | |
| ```bash | |
| python main.py | |
| ``` | |
| ### Step 3: Accessing the UI | |
| Open your browser and navigate to the following address: | |
| `http://localhost:8000` | |
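To confirm from a script that the server from Step 2 is answering, a small standard-library check such as the following could be used. It is only a convenience sketch; `server_is_up` is a hypothetical helper, and the URL is the one given above.

```python
import urllib.request


def server_is_up(url="http://localhost:8000", timeout=2.0):
    """Return True if an HTTP GET to url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers connection refused, timeouts, and HTTP error statuses
        # (urllib's URLError and HTTPError both derive from OSError).
        return False


if __name__ == "__main__":
    print("server reachable:", server_is_up())
```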
| ----- | |
| ## Key Component Breakdown | |
| ### Advanced Prompt Engineering (Prompt.py) | |
| The core of the system's behavior lies in its specialized persona. The prompt explicitly forbids robotic disclaimers such as "According to the context" or "The information provided states." It instructs the model to merge duplicate facts into single, well-formed sentences and to strictly enforce the lowercase **atomcamp** branding. | |
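One way to spot-check answers against these rules during testing is a small validator like the one below. It is not part of the project as described; `FORBIDDEN_PHRASES` and `find_style_violations` are hypothetical names, and the phrase list only mirrors the two examples quoted above.

```python
import re

# Lowercased prefixes of the disclaimers quoted above; extend as needed.
FORBIDDEN_PHRASES = ("according to the context", "the information provided state")


def find_style_violations(answer):
    """Return a list of human-readable problems found in a model answer."""
    problems = []
    lowered = answer.lower()
    for phrase in FORBIDDEN_PHRASES:
        if phrase in lowered:
            problems.append(f"forbidden phrase: {phrase!r}")
    # The brand must always be written in lowercase: "atomcamp".
    for match in re.finditer(r"\batomcamp\b", answer, flags=re.IGNORECASE):
        if match.group() != "atomcamp":
            problems.append(f"brand casing: {match.group()!r}")
    return problems


print(find_style_violations("atomcamp offers a data science bootcamp."))
# []
print(find_style_violations("According to the context, Atomcamp offers courses."))
# ["forbidden phrase: 'according to the context'", "brand casing: 'Atomcamp'"]
```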
| ### Retrieval Logic (chain.py) | |
| The system uses LangChain's `RunnableParallel`. When a question is received, the chain retrieves relevant documents from Qdrant and pulls the last several turns of conversation from memory at the same time, then passes both, together with the question, into the final prompt for the LLM. Running the two lookups in parallel rather than one after the other reduces response latency. | |
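The fan-out pattern can be sketched with the standard library alone. The example below uses `ThreadPoolExecutor` as a stand-in for `RunnableParallel`, so it shows the shape of the idea rather than the code in `chain.py`; `retrieve_documents`, `load_history`, and `build_prompt_inputs` are placeholder names with canned return values.

```python
from concurrent.futures import ThreadPoolExecutor


def retrieve_documents(question):
    """Placeholder for the Qdrant retrieval step."""
    return [f"(document relevant to: {question})"]


def load_history(session_id):
    """Placeholder for reading recent turns from conversation memory."""
    return ["user: hi", "assistant: hello"]


def build_prompt_inputs(question, session_id):
    """Run both lookups concurrently, then merge them into one dict."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        docs_future = pool.submit(retrieve_documents, question)
        history_future = pool.submit(load_history, session_id)
        # .result() blocks until each branch has finished.
        return {
            "context": docs_future.result(),
            "history": history_future.result(),
            "question": question,
        }


inputs = build_prompt_inputs("What courses are offered?", "session-1")
print(sorted(inputs))
# ['context', 'history', 'question']
```

The merged dictionary plays the role of the prompt variables: only once both branches have returned can the prompt be filled in and sent to the model.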
| ----- | |
| ## Advanced UI/UX Implementation | |
| * Professional Input Handling: The input bar is a custom textarea that supports `Enter` to send and `Shift + Enter` for new lines, behaving like a modern enterprise communication tool. | |
| * Horizontal Container Stability: The chat bubbles use word-break rules so that long strings of code or unbroken characters wrap instead of overflowing the container width. | |
| * Thematic Link Management: Links are styled with a dynamic CSS variable (`--link-color`) that switches to a high-contrast orange in dark mode, ensuring email addresses and URLs are always readable. | |
| ----- | |
| ## Status and Versioning | |
| * Version: 1.5.0 | |
| * Last Updated: July 2025 | |
| * Status: Production Ready | |
| * Organization: atomcamp Official |