---
title: HR Intervals Chatbot
emoji: 💼
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.49.0
app_file: app.py
pinned: false
---

# HR Intervals AI Assistant

A **RAG-powered chatbot** that provides HR knowledge and policy guidance for non-profit organizations. Built with LangChain, OpenAI, Qdrant, Firecrawl, and Gradio, with optional LangSmith observability.

---

## Table of Contents

- [Features](#features)
- [Architecture Overview](#architecture-overview)
- [Project Structure](#project-structure)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Configuration](#configuration)
  - [OpenAI](#openai)
  - [Qdrant](#qdrant)
  - [Firecrawl](#firecrawl)
  - [LangSmith](#langsmith)
- [Deployment Model](#deployment-model)
- [Running the Application](#running-the-application)
- [Admin Interface](#admin-interface)
- [How It Works](#how-it-works)
- [Embedding the Chatbot Widget](#embedding-the-chatbot-widget)
- [Testing](#testing)
- [Troubleshooting](#troubleshooting)

---

## Features

- **AI-powered Q&A**: Answers HR questions using a retrieval-augmented generation (RAG) pipeline
- **Source citations**: Every answer includes references to the documents it was derived from
- **Web scraping**: Ingest web pages into the knowledge base via Firecrawl
- **Document ingestion**: Upload PDF and DOCX files through the admin interface
- **PII detection**: Warns users when personally identifiable information is detected in queries
- **Conversation memory**: Maintains context across multi-turn conversations with session-based history
- **Multi-query retrieval**: Expands user queries for better document matching
- **Admin dashboard**: Manage documents, scrape URLs, and monitor the knowledge base
- **Observability (optional)**: Trace and debug the full RAG pipeline with LangSmith

---

## Architecture Overview

```
┌────────────┐       ┌────────────────┐       ┌──────────────┐
│   Gradio   │──────▶│   RAG Chain    │──────▶│    OpenAI    │
│   (User)   │       │  (LangChain)   │       │    GPT-4o    │
└────────────┘       └───────┬────────┘       └──────────────┘
                             │
                     ┌───────▼────────┐
                     │     Qdrant     │
                     │  Vector Store  │
                     └───────▲────────┘
                             │
              ┌──────────────┴──────────────┐
              │                             │
     ┌────────▼───────┐          ┌──────────▼─────────┐
     │  PDF / DOCX    │          │     Firecrawl      │
     │   Ingestion    │          │    Web Scraping    │
     └────────────────┘          └────────────────────┘
```

**Data flow:**

1. Documents (PDF, DOCX) and web pages are ingested, chunked, and embedded using OpenAI embeddings.
2. Embeddings are stored in a Qdrant vector database.
3. When a user asks a question, the query is expanded via MultiQueryRetriever, matched against stored embeddings, and the top-k results are used as context.
4. OpenAI GPT-4o generates a grounded answer using the retrieved context.
5. (Optional) The entire pipeline is traced and logged in LangSmith.

---

## Project Structure

```
hr-intervals-chatbot/
├── app.py                  # User-facing chat interface (Gradio, port 7860)
├── admin.py                # Admin dashboard (Gradio, port 7861)
├── chatbot-widget.html     # Embeddable HTML widget for the chatbot
├── requirements.txt        # Python dependencies
├── .env                    # Environment variables (not committed)
├── src/
│   ├── __init__.py
│   ├── chatbot.py          # RAG chain construction and query handling
│   ├── ingestion.py        # PDF/DOCX document processing
│   ├── scraper.py          # Firecrawl web scraping
│   └── vector_store.py     # Qdrant vector store utilities
└── tests/
    └── test_connections.py # API connection verification tests
```

---

## Prerequisites

- **Python 3.10+**
- An **OpenAI** account with API access
- A **Qdrant Cloud** account (or self-hosted Qdrant instance)
- A **Firecrawl** account for web scraping
- (Optional) A **LangSmith** account for observability

---

## Installation

1. **Clone the repository:**

   ```bash
   git clone https://huggingface.co/spaces/pikamomo/hr-intervals-chatbot
   cd hr-intervals-chatbot
   ```

2. **Create and activate a virtual environment:**

   ```bash
   python -m venv venv

   # Windows
   venv\Scripts\activate

   # macOS / Linux
   source venv/bin/activate
   ```

3. **Install dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

4. **Create your `.env` file** (see [Configuration](#configuration) below):

   ```dotenv
   OPENAI_API_KEY=""
   OPEN_AI_EMBEDDING_MODEL=""
   OPEN_AI_CHAT_MODEL=""
   QDRANT_URL=""
   QDRANT_API_KEY=""
   QDRANT_COLLECTION=""
   LANGSMITH_TRACING=""
   LANGSMITH_ENDPOINT=""
   LANGSMITH_API_KEY=""
   LANGSMITH_PROJECT=""
   FIRECRAWL_API_KEY=""
   ```

---

## Configuration

Create a `.env` file in the project root with the following variables. All services are configured exclusively through environment variables.

### OpenAI

OpenAI powers both the **embedding model** (for converting text into vectors) and the **chat model** (for generating answers).

| Variable                  | Required | Default                  | Description                                           |
| ------------------------- | -------- | ------------------------ | ----------------------------------------------------- |
| `OPENAI_API_KEY`          | Yes      | -                        | Your OpenAI API key                                   |
| `OPEN_AI_EMBEDDING_MODEL` | No       | `text-embedding-3-small` | Embedding model for vectorizing documents and queries |
| `OPEN_AI_CHAT_MODEL`      | No       | `gpt-4o`                 | Chat model for generating RAG answers                 |

**How it's used:**

- **Embeddings** (`text-embedding-3-small`): Every document chunk and user query is embedded using this model before being stored in or searched against Qdrant. The `OpenAIEmbeddings` class from `langchain-openai` handles this.
- **Chat completions** (`gpt-4o`): After relevant document chunks are retrieved, they are passed as context to the chat model along with a system prompt that instructs it to act as an HR assistant. The model generates answers at `temperature=0.3` for factual consistency.

```dotenv
OPENAI_API_KEY="sk-proj-your-key-here"
OPEN_AI_EMBEDDING_MODEL=text-embedding-3-small
OPEN_AI_CHAT_MODEL=gpt-4o
```

> **Tip:** You can switch to `text-embedding-3-large` for higher-quality embeddings or `gpt-4o-mini` for lower-cost chat completions. If you change the embedding model, you must re-ingest all documents, because the vector dimensions will differ.

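
The OpenAI settings above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: `load_openai_settings`, `collection_is_compatible`, and `EMBEDDING_DIMENSIONS` are hypothetical names, and only the variable names and defaults come from the table above. The dimension values are the default output sizes of the two OpenAI embedding models.

```python
import os

# Default output dimensions of the two embedding models mentioned above.
EMBEDDING_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def load_openai_settings(env=None):
    """Read the OpenAI variables and apply the documented defaults.

    Empty strings (as in the blank .env template) are treated as unset.
    """
    env = os.environ if env is None else env
    api_key = (env.get("OPENAI_API_KEY") or "").strip()
    if not api_key:
        raise ValueError("OPENAI_API_KEY is required")
    return {
        "api_key": api_key,
        "embedding_model": env.get("OPEN_AI_EMBEDDING_MODEL") or "text-embedding-3-small",
        "chat_model": env.get("OPEN_AI_CHAT_MODEL") or "gpt-4o",
    }


def collection_is_compatible(collection_vector_size, embedding_model):
    """Check that an existing collection's vector size matches the model's output."""
    return EMBEDDING_DIMENSIONS[embedding_model] == collection_vector_size
```

A collection built with `text-embedding-3-small` stores 1536-dimensional vectors, so `collection_is_compatible(1536, "text-embedding-3-large")` is false. That mismatch is why the tip above calls for re-ingesting all documents.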
---

### Qdrant

Qdrant is the **vector database** that stores document embeddings and enables similarity search.

| Variable            | Required | Default        | Description                                        |
| ------------------- | -------- | -------------- | -------------------------------------------------- |
| `QDRANT_URL`        | Yes      | -              | URL of your Qdrant instance (cloud or self-hosted) |
| `QDRANT_API_KEY`    | Yes      | -              | API key for authenticating with Qdrant             |
| `QDRANT_COLLECTION` | No       | `hr-intervals` | Name of the vector collection                      |

**How it's used:**

- **Storage**: Document chunks are embedded and stored as points in a Qdrant collection. Each point contains the vector embedding plus metadata (`source`, `type`, `upload_date`, etc.).
- **Retrieval**: When a user asks a question, the query is embedded and a cosine similarity search retrieves the top 8 most relevant chunks.
- **Management**: The admin interface interacts directly with Qdrant to list, filter, and delete documents by source.

**Setting up Qdrant Cloud:**

1. Sign up at [cloud.qdrant.io](https://cloud.qdrant.io)
2. Create a new cluster (the free tier works for development)
3. Create an API key from the cluster dashboard
4. Copy the cluster URL and API key into your `.env`

```dotenv
QDRANT_URL="https://your-cluster-id.region.cloud.qdrant.io:6333"
QDRANT_API_KEY="your-qdrant-api-key"
QDRANT_COLLECTION="hr-intervals"
```

> **Note:** The collection is created automatically when you first ingest documents. You do not need to create it manually.

---

### Firecrawl

Firecrawl is a **web scraping service** that converts web pages into clean markdown, which makes it well suited to ingesting online HR resources, policies, and articles into the knowledge base.


| Variable            | Required | Default | Description            |
| ------------------- | -------- | ------- | ---------------------- |
| `FIRECRAWL_API_KEY` | Yes\*    | -       | Your Firecrawl API key |

_\*Required only if you plan to use the web scraping feature._

**How it's used:**

- **Single URL scraping**: From the admin dashboard, enter a URL and Firecrawl fetches the page content as markdown.
- **Batch scraping**: Paste multiple URLs (one per line) to scrape several pages at once.
- **Duplicate detection**: Before scraping, the system checks if a URL has already been ingested to prevent duplicates.
- **Pipeline**: Scraped markdown is split into chunks (1000 characters, 200 overlap), embedded via OpenAI, and stored in Qdrant with `type: "webpage"` metadata.

**Setting up Firecrawl:**

1. Sign up at [firecrawl.dev](https://www.firecrawl.dev)
2. Get your API key from the dashboard
3. Add it to your `.env`

```dotenv
FIRECRAWL_API_KEY="fc-your-firecrawl-api-key"
```

---

### LangSmith

LangSmith provides **observability and tracing** for the entire LangChain pipeline. When enabled, every chain invocation, from query expansion through retrieval to generation, is logged and can be inspected in the LangSmith dashboard.

| Variable             | Required | Default                           | Description                         |
| -------------------- | -------- | --------------------------------- | ----------------------------------- |
| `LANGSMITH_TRACING`  | No       | `false`                           | Set to `true` to enable tracing     |
| `LANGSMITH_ENDPOINT` | No       | `https://api.smith.langchain.com` | LangSmith API endpoint              |
| `LANGSMITH_API_KEY`  | No       | -                                 | Your LangSmith API key              |
| `LANGSMITH_PROJECT`  | No       | `hr-intervals-chatbot`            | Project name in LangSmith dashboard |

**How it's used:**

- When `LANGSMITH_TRACING=true`, LangChain **automatically** sends trace data to LangSmith for every chain execution. No code changes are needed; LangChain detects these environment variables at runtime.
- Traces include: input queries, retrieved documents, prompt templates, LLM responses, latency, token usage, and errors.
- Use the LangSmith dashboard to debug retrieval quality, monitor token costs, and identify slow chain steps.

**Setting up LangSmith:**

1. Sign up at [smith.langchain.com](https://smith.langchain.com)
2. Create a new project (e.g., `hr-intervals-chatbot`)
3. Generate an API key
4. Add the variables to your `.env`

```dotenv
LANGSMITH_TRACING=true
LANGSMITH_ENDPOINT=https://api.smith.langchain.com
LANGSMITH_API_KEY="lsv2_pt_your-langsmith-api-key"
LANGSMITH_PROJECT=hr-intervals-chatbot
```

> **Tip:** Keep `LANGSMITH_TRACING=false` in production to avoid overhead, or enable it selectively for debugging.

---

### Complete `.env` Example

```dotenv
# ── OpenAI ───────────────────────────────────────────────
OPENAI_API_KEY="sk-proj-your-openai-api-key"
OPEN_AI_EMBEDDING_MODEL=text-embedding-3-small
OPEN_AI_CHAT_MODEL=gpt-4o

# ── Qdrant ───────────────────────────────────────────────
QDRANT_URL="https://your-cluster-id.region.cloud.qdrant.io:6333"
QDRANT_API_KEY="your-qdrant-api-key"
QDRANT_COLLECTION="hr-intervals"

# ── LangSmith (optional) ─────────────────────────────────
LANGSMITH_TRACING=false
LANGSMITH_ENDPOINT=https://api.smith.langchain.com
LANGSMITH_API_KEY="lsv2_pt_your-langsmith-api-key"
LANGSMITH_PROJECT=hr-intervals-chatbot

# ── Firecrawl ────────────────────────────────────────────
FIRECRAWL_API_KEY="fc-your-firecrawl-api-key"
```

---

## Deployment Model

This project uses a **split deployment** architecture: the user-facing chatbot runs in the cloud, while the admin dashboard runs on your local machine.

```
┌─────────────────────────────────┐     ┌─────────────────────────────┐
│   Hugging Face Spaces (Cloud)   │     │     Your Local Machine      │
│                                 │     │                             │
│   app.py (port 7860)            │     │   admin.py (port 7861)      │
│   - User chat interface         │     │   - Upload documents        │
│   - RAG Q&A                     │     │   - Scrape web pages        │
│   - Public access               │     │   - Delete documents        │
│                                 │     │   - View knowledge base     │
└────────────┬────────────────────┘     └──────────┬──────────────────┘
             │                                     │
             └──────────┬──────────────────────────┘
                        │  Both connect to the same
                        ▼  cloud services
               ┌─────────────────┐  ┌──────────────┐
               │  Qdrant Cloud   │  │  OpenAI API  │
               └─────────────────┘  └──────────────┘
```

| Component  | Where it runs           | Purpose                   | Access                       |
| ---------- | ----------------------- | ------------------------- | ---------------------------- |
| `app.py`   | **Hugging Face Spaces** | User-facing chatbot       | Public (anyone with the URL) |
| `admin.py` | **Your local machine**  | Knowledge base management | Private (admin only)         |

**Why this split?**

- The **chatbot** (`app.py`) is deployed to Hugging Face Spaces so end users can access it 24/7 via a public URL without needing any local setup.
- The **admin dashboard** (`admin.py`) runs locally because it performs sensitive operations (uploading documents, deleting data, scraping URLs). Keeping it local ensures only authorized administrators can modify the knowledge base.
- Both components share the same `.env` configuration and connect to the same Qdrant and OpenAI instances, so changes made via the admin dashboard are immediately reflected in the chatbot.

---

## Running the Application

### User Chat Interface (Cloud)

The chatbot is deployed on Hugging Face Spaces and accessible at:

```
https://pikamomo-hr-intervals-chatbot.hf.space
```

To deploy your own instance, see [Deploying to Hugging Face Spaces](#deploying-to-hugging-face-spaces) below.

For local development and testing, you can also run it locally:

```bash
python app.py
```

This opens the chatbot at **http://localhost:7860**.

### Admin Dashboard (Local)

The admin panel runs on your local machine. Start it with:

```bash
python admin.py
```

This opens the admin dashboard at **http://localhost:7861**. See [Admin Interface](#admin-interface) for details on each tab.

> **Important:** The admin dashboard is intentionally **not deployed** to the cloud. Always run it locally to maintain control over who can modify the knowledge base.

---

## Admin Interface

The admin dashboard (`admin.py`) provides tools to manage the knowledge base:

| Tab                  | Description                                                                         |
| -------------------- | ----------------------------------------------------------------------------------- |
| **View Documents**   | Lists all ingested documents with metadata (source, type, upload date, chunk count) |
| **Upload Documents** | Upload PDF or DOCX files. Choose a document type (policy, guide, article, etc.)     |
| **Scrape Web Pages** | Scrape a single URL or batch-scrape multiple URLs via Firecrawl                     |
| **Delete Documents** | Remove documents from the vector store by source name                               |
| **Help**             | Usage instructions and tips                                                         |

### Ingesting Documents

**Via file upload (PDF / DOCX):**

1. Open the admin dashboard
2. Go to the **Upload Documents** tab
3. Select your file and choose a document type
4. Click Upload. The file is parsed, chunked, embedded, and stored in Qdrant

**Via web scraping:**

1. Go to the **Scrape Web Pages** tab
2. Enter a URL (or multiple URLs, one per line)
3. Click Scrape. Firecrawl fetches the page as markdown, which is then chunked and stored

### Document Metadata

Each document chunk stored in Qdrant carries the following metadata:

| Field         | Type   | Description                                            |
| ------------- | ------ | ------------------------------------------------------ |
| `source`      | string | Filename or URL                                        |
| `type`        | string | `document`, `webpage`, `policy`, `guide`, or `article` |
| `upload_date` | string | Ingestion date (`YYYY-MM-DD`)                          |
| `page`        | int    | Page number (PDFs only)                                |
| `valid_until` | string | Expiry date for time-sensitive policies (optional)     |
| `version`     | string | Document version (optional)                            |

---

## How It Works

### RAG Pipeline

1. **Query expansion**: The user's question is passed through a `MultiQueryRetriever` that generates multiple rephrasings to improve recall.
2. **Embedding**: Each query variant is embedded using `text-embedding-3-small`.
3. **Retrieval**: Cosine similarity search against Qdrant returns the top 8 most relevant document chunks.
4. **Generation**: Retrieved chunks are injected into a prompt template alongside the conversation history, and GPT-4o generates a grounded answer.
5. **Citation**: The top 3 source documents are appended to the response.

### Chunking Strategy

Documents are split using `RecursiveCharacterTextSplitter`:

- **Chunk size:** 1,000 characters
- **Overlap:** 200 characters
- **Separators:** `["\n\n", "\n", ". ", " ", ""]`

### Session Management

- Each user gets a unique session ID (UUID)
- Conversation history is stored in memory per session
- Sessions expire after **1 hour** of inactivity
- History enables follow-up questions and contextual conversations

### PII Detection

A regex-based check warns users when their query appears to contain names (e.g., `John Smith`). This is a first-line safeguard; integration with Microsoft Presidio is planned for more robust PII detection.

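
To illustrate the kind of regex check described above, here is a minimal sketch. The pattern and the names `NAME_PATTERN`, `find_possible_names`, and `pii_warning` are assumptions made for illustration; the project's actual expression is not shown in this README. The sketch flags two consecutive capitalized words, which catches `John Smith` but also yields false positives such as `Hugging Face`.

```python
import re
from typing import List, Optional

# Hypothetical pattern: two adjacent capitalized words, e.g. "John Smith".
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")


def find_possible_names(query: str) -> List[str]:
    """Return substrings of the query that look like a first and last name."""
    return NAME_PATTERN.findall(query)


def pii_warning(query: str) -> Optional[str]:
    """Return a warning message if the query seems to contain a name, else None."""
    matches = find_possible_names(query)
    if not matches:
        return None
    return "Your message may contain personal names: " + ", ".join(matches)
```

A heuristic this simple trades precision for speed. That trade-off is consistent with the README's description of the check as a first-line safeguard.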
---

## Embedding the Chatbot Widget

The file `chatbot-widget.html` provides a ready-to-use **floating chat widget** that you can embed on any website. It renders a circular button in the bottom-right corner that opens the chatbot in a popup window, with no page navigation required.

### Quick Start

The simplest way to add the chatbot to an existing web page is to copy three pieces from `chatbot-widget.html` into your site:

1. **CSS** (add to your `<head>` or stylesheet)
2. **HTML** (add before `</body>`)
3. **JavaScript** (add before `</body>`)

### Step 1: Add the CSS

Copy the widget styles into your page's `<head>` (or into your existing CSS file). These styles are marked between `CHATBOT WIDGET STYLES - COPY FROM HERE` and `END OF CHATBOT WIDGET STYLES` in the source file.

### Step 2: Add the HTML

Add the following HTML just before your closing `</body>` tag. Update the `src` URL to point to your own Hugging Face Space:

```html
```

### Step 3: Add the JavaScript

Add this script after the HTML above:

```html
```

### Customization

| What to change  | Where                                  | Details                                                                |
| --------------- | -------------------------------------- | ---------------------------------------------------------------------- |
| Chatbot URL | `