---
title: Product Inquiry Chatbot for Electronic Store
emoji: 🛍️
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
python_version: 3.13
pinned: true
---


# 🛍️ Product Inquiry Chatbot (Hugging Face Spaces Version)

This project is a RAG (Retrieval-Augmented Generation) chatbot designed to answer questions about a product catalog and reviews, deployed on Hugging Face Spaces. It leverages the Hugging Face Inference API to power its language model capabilities.

The chatbot provides a user-friendly web interface where users can ask questions in natural language. The system intelligently retrieves relevant product information and customer reviews to generate accurate and context-aware answers.
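At a high level, the query-response pipeline can be sketched with stubbed stages. The function names and stub bodies below are illustrative assumptions, not the project's actual API (the real logic lives across `chatbot_app.py`, `llm_interface.py`, and `retrieval_manager.py`):

```python
# Illustrative sketch of the RAG pipeline stages. These stubs are
# assumptions for exposition; the real versions call the Hugging Face
# Inference API and ChromaDB.

def moderate(query: str) -> bool:
    """Stub: return True if the query is safe (real version asks an LLM)."""
    return "attack" not in query.lower()

def rewrite(query: str, history: list[str]) -> str:
    """Stub: make a follow-up question self-contained."""
    return query if not history else f"{query} (in the context of: {history[-1]})"

def retrieve(query: str) -> list[str]:
    """Stub: return relevant documents (real version queries ChromaDB)."""
    return [f"doc matching '{query}'"]

def generate(query: str, docs: list[str]) -> str:
    """Stub: produce the final answer (real version calls the Inference API)."""
    return f"Answer to '{query}' based on {len(docs)} document(s)."

def answer(query: str, history: list[str]) -> str:
    """Orchestrate moderation -> rewriting -> retrieval -> generation."""
    if not moderate(query):
        return "Sorry, I can't help with that."
    rewritten = rewrite(query, history)
    docs = retrieve(rewritten)
    return generate(rewritten, docs)
```

Each stage can fail or be swapped independently, which is what makes the moderation and rewriting steps easy to test in isolation.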

IMPORTANT: The `gradio-spaces-version` branch is tuned for publishing on Hugging Face Spaces, while the `main` branch contains code that runs locally with Ollama and therefore has a different set of instructions.


![Chatbot UI Screenshot](misc/chatbot_ui.png)
*The user interface, showing the chat history.*

## ✨ Key Features

- **Cloud-Based RAG**: The chatbot uses a Retrieval-Augmented Generation architecture. It searches a pre-built vector database of product specifications and customer reviews to find the most relevant information before answering a query.
- **Hugging Face Powered**: All LLM interactions (moderation, query rewriting, and response generation) are handled by models hosted on the Hugging Face Inference API.
- **Intelligent Query Handling**:
    - **Query Moderation**: Input queries are checked for safety using a moderation model (`Qwen/Qwen2.5-7B-Instruct`).
    - **Contextual Rewriting**: Follow-up questions are automatically rewritten to be self-contained, improving retrieval accuracy (e.g., "Tell me about its warranty" becomes "What is the warranty for the TechPro Ultrabook?").
- **Transparent & Debuggable**: The interface includes an accordion that shows the exact documents the chatbot used to generate its response, which is useful for understanding the retrieval process.
![Retrieved Documents Screenshot](misc/resource_docs.png)
*The user interface, showing the retrieved documents used for generation.*
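The contextual rewriting step boils down to a chat-completion call with the conversation history in the prompt. The `build_rewrite_messages` helper and the system-prompt wording below are illustrative assumptions, not the project's actual prompt:

```python
# Hypothetical helper that builds the messages for the query-rewriting
# call. The system prompt wording is an assumption; the real prompt
# lives in llm_interface.py.

def build_rewrite_messages(history: list[tuple[str, str]], query: str) -> list[dict]:
    transcript = "\n".join(f"User: {u}\nBot: {b}" for u, b in history)
    system = (
        "Rewrite the user's latest question so it is fully self-contained, "
        "resolving pronouns and references using the conversation so far."
    )
    user = (
        f"Conversation:\n{transcript}\n\n"
        f"Latest question: {query}\nRewritten question:"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

# The messages would then be sent via huggingface_hub's InferenceClient, e.g.:
# from huggingface_hub import InferenceClient
# client = InferenceClient("meta-llama/Llama-3.2-3B-Instruct")
# rewritten = client.chat_completion(build_rewrite_messages(history, query))
```

Because the rewritten query is self-contained, the retrieval step no longer depends on conversation state.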


## 🚀 Getting Started

The application is hosted on Hugging Face Spaces. You can access it directly via your web browser.

**➡️ Live Demo: [Link to your Hugging Face Space]**

No local installation is required. The vector database is pre-built and included in the repository, and all language models are accessed via remote API.

## 🧪 Testing Scripts

The `test_scripts/` directory contains several Python scripts to test different parts of the RAG pipeline locally (requires cloning the repository and setting up a local environment).

-   `test_chatbot_responses.py`: Automates testing of chatbot responses against a set of predefined questions.
-   `test_chunking.py`: Tests the document chunking strategies defined in `document_processor.py`.
-   `test_retrieval_evaluation.py`: Evaluates the performance of the retrieval system based on user queries.
-   `test_splitting.py`: A script for testing different text splitting methods.
-   `test_vector_db_population.py`: Tests the population of the vector database.

## 📂 Project Structure & Key Files

Here is a breakdown of the most important files in the project:

-   `app.py`: The main entry point for the Hugging Face Spaces deployment. It imports and launches the Gradio application.
-   `chatbot_app.py`: Contains the Gradio UI code and orchestrates the entire query-response pipeline, from receiving user input to displaying the final answer.
-   `llm_interface.py`: Handles all communication with the Hugging Face Inference API. It is responsible for query moderation, rewriting, and final response generation using serverless models like `meta-llama/Llama-3.2-3B-Instruct` and `Qwen/Qwen2.5-7B-Instruct`.
-   `retrieval_manager.py`: The core of the retrieval system. It takes a rewritten query, performs a semantic search in the pre-built ChromaDB database, and returns the most relevant documents.
-   `vector_db_manager.py`: A utility script used to create the vector database from source JSON files. In the deployed version, this script is pre-run, and the database is included in the repository.
-   `document_processor.py`: Contains functions for loading, processing, and chunking source documents (from JSON files) before they are added to the vector database.
-   `products.json` & `product_reviews.json`: The raw data sources for the chatbot's knowledge base.
-   `chroma_db/`: The pre-built vector database containing embeddings for all products and reviews.
-   `static/`: Contains static assets for the Gradio UI, like user and bot avatar images.
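The semantic search performed by `retrieval_manager.py` amounts to embedding the query and ranking stored document embeddings by similarity. A minimal, self-contained illustration of that idea, using toy bag-of-words vectors in place of the real sentence embeddings and ChromaDB:

```python
# Toy illustration of semantic retrieval: cosine similarity over
# bag-of-words vectors. The real system uses learned sentence
# embeddings stored in ChromaDB; this only shows the ranking idea.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': word-count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "TechPro Ultrabook: 1-year limited warranty, 16 GB RAM",
    "BlueWave speaker: waterproof, 12-hour battery",
    "Review: the TechPro Ultrabook warranty service was fast",
]
top = retrieve("What is the warranty for the TechPro Ultrabook?", docs)
```

Swapping the toy `embed` for a sentence-embedding model and the sort for a vector-index lookup gives you the ChromaDB-backed version.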

## 🔮 Potential Future Improvements
1. Fine-tuning of metadata filters
2. Smarter routing logic to replace the current simple deterministic approach
3. Improved context-awareness across conversation turns
4. Better extraction of categories and products from user queries (either LLM-based or via stemming to handle plurals)