---
title: Product Inquiry Chatbot for Electronic Store
emoji: 🛍️
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
python_version: 3.13
pinned: true
---
# 🛍️ Product Inquiry Chatbot (Hugging Face Spaces Version)
This project is a RAG (Retrieval-Augmented Generation) chatbot designed to answer questions about a product catalog and reviews, deployed on Hugging Face Spaces. It leverages the Hugging Face Inference API to power its language model capabilities.
The chatbot provides a user-friendly web interface where users can ask questions in natural language. The system intelligently retrieves relevant product information and customer reviews to generate accurate and context-aware answers.
IMPORTANT: The `gradio-spaces-version` branch is tuned for publishing on Hugging Face Spaces, while the `main` branch contains code that runs locally with Ollama and therefore has a different set of instructions.
*The user interface, showing the chat history.*
## ✨ Key Features
- Cloud-Based RAG: The chatbot uses a Retrieval-Augmented Generation architecture. It searches a pre-built vector database of product specifications and customer reviews to find the most relevant information before answering a query.
- Hugging Face Powered: All LLM interactions (moderation, query rewriting, and response generation) are handled by models hosted on the Hugging Face Inference API.
- Intelligent Query Handling:
  - Query Moderation: Input queries are checked for safety using a moderation model (`Qwen/Qwen2.5-7B-Instruct`).
  - Contextual Rewriting: Follow-up questions are automatically rewritten to be self-contained, improving retrieval accuracy (e.g., "Tell me about its warranty" becomes "What is the warranty for the TechPro Ultrabook?").
- Transparent & Debuggable: The interface includes an accordion that shows the exact documents the chatbot used to generate its response, which is useful for understanding the retrieval process.
*The user interface, showing the retrieved documents used for generation.*
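The moderation, rewriting, retrieval, and generation steps above can be sketched as a single orchestration function. This is a minimal illustration, not the project's actual code: the step implementations are injected as callables (the real app performs them through the Hugging Face Inference API in `llm_interface.py`), and the toy stand-ins below are made up for demonstration.

```python
from typing import Callable, List

def answer_query(
    query: str,
    history: List[str],
    moderate: Callable[[str], bool],           # returns True if the query is safe
    rewrite: Callable[[str, List[str]], str],  # makes follow-ups self-contained
    retrieve: Callable[[str], List[str]],      # semantic search over the vector DB
    generate: Callable[[str, List[str]], str], # answer from query + context docs
) -> str:
    """Hypothetical sketch of the RAG query pipeline."""
    if not moderate(query):
        return "Sorry, I can't help with that request."
    standalone = rewrite(query, history) if history else query
    docs = retrieve(standalone)
    return generate(standalone, docs)

# Toy stand-ins so the sketch runs without any API access:
safe = lambda q: "hack" not in q.lower()
rewrite = lambda q, h: q.replace("its", "the TechPro Ultrabook's")
retrieve = lambda q: ["TechPro Ultrabook: 2-year warranty"]
generate = lambda q, docs: f"Based on {len(docs)} document(s): {docs[0]}"

print(answer_query("Tell me about its warranty", ["previous turn"],
                   safe, rewrite, retrieve, generate))
```

The point of the dependency-injection shape is that each stage can be swapped or mocked independently, which is also what makes the pipeline testable offline.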
## 🚀 Getting Started
The application is hosted on Hugging Face Spaces. You can access it directly via your web browser.
➡️ Live Demo: [Link to your Hugging Face Space]
No local installation is required. The vector database is pre-built and included in the repository, and all language models are accessed via remote API.
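At its core, the semantic search behind the pre-built database amounts to embedding the query and ranking stored document embeddings by cosine similarity. A dependency-free sketch of that idea (the actual app queries a pre-built ChromaDB collection via `retrieval_manager.py`; the 3-dimensional vectors below are invented toy "embeddings"):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "embeddings" standing in for real embedding-model output.
docs = {
    "TechPro Ultrabook: 16GB RAM, 1TB SSD": [0.9, 0.1, 0.2],
    "Review: great battery life on the Ultrabook": [0.7, 0.6, 0.1],
    "BlueWave Gaming Laptop: RTX graphics": [0.1, 0.9, 0.8],
}

def retrieve(query_embedding, k=2):
    """Return the k documents whose embeddings are closest to the query."""
    ranked = sorted(docs, key=lambda d: cosine(query_embedding, docs[d]), reverse=True)
    return ranked[:k]

print(retrieve([0.8, 0.2, 0.1]))
```

A real vector store like ChromaDB does the same ranking, just with approximate-nearest-neighbor indexing instead of a full scan.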
## 🧪 Testing Scripts
The test_scripts/ directory contains several Python scripts to test different parts of the RAG pipeline locally (requires cloning the repository and setting up a local environment).
- `test_chatbot_responses.py`: Automates testing of chatbot responses against a set of predefined questions.
- `test_chunking.py`: Tests the document chunking strategies defined in `document_processor.py`.
- `test_retrieval_evaluation.py`: Evaluates the performance of the retrieval system based on user queries.
- `test_splitting.py`: A script for testing different text splitting methods.
- `test_vector_db_population.py`: Tests the population of the vector database.
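A response check in the spirit of `test_chatbot_responses.py` might verify that predefined questions yield answers containing expected keywords. The sketch below is hypothetical: the question set, the `ask` callable, and the keyword expectations are all made up, and the real script's interface will differ.

```python
# Hypothetical keyword-based response check; `ask` stands in for the chatbot call.
CASES = [
    ("What is the warranty on the TechPro Ultrabook?", ["warranty"]),
    ("Which laptops have 16GB RAM?", ["16GB", "RAM"]),
]

def check_responses(ask):
    """Return (question, missing_keywords) pairs for every failed case."""
    failures = []
    for question, keywords in CASES:
        answer = ask(question)
        missing = [kw for kw in keywords if kw.lower() not in answer.lower()]
        if missing:
            failures.append((question, missing))
    return failures

# Fake chatbot for demonstration:
fake_ask = lambda q: "The TechPro Ultrabook ships with a 2-year warranty and 16GB RAM."
print(check_responses(fake_ask))  # → [] when all keywords are present
```

Keyword checks are crude but cheap; they catch regressions in retrieval and prompting without needing an LLM-based judge.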
## 📁 Project Structure & Key Files
Here is a breakdown of the most important files in the project:
- `app.py`: The main entry point for the Hugging Face Spaces deployment. It imports and launches the Gradio application.
- `chatbot_app.py`: Contains the Gradio UI code and orchestrates the entire query-response pipeline, from receiving user input to displaying the final answer.
- `llm_interface.py`: Handles all communication with the Hugging Face Inference API. It is responsible for query moderation, rewriting, and final response generation using serverless models like `meta-llama/Llama-3.2-3B-Instruct` and `Qwen/Qwen2.5-7B-Instruct`.
- `retrieval_manager.py`: The core of the retrieval system. It takes a rewritten query, performs a semantic search in the pre-built ChromaDB database, and returns the most relevant documents.
- `vector_db_manager.py`: A utility script used to create the vector database from source JSON files. In the deployed version, this script is pre-run, and the database is included in the repository.
- `document_processor.py`: Contains functions for loading, processing, and chunking source documents (from JSON files) before they are added to the vector database.
- `products.json` & `product_reviews.json`: The raw data sources for the chatbot's knowledge base.
- `chroma_db/`: The pre-built vector database containing embeddings for all products and reviews.
- `static/`: Contains static assets for the Gradio UI, like user and bot avatar images.
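Chunking, as done in `document_processor.py`, splits long product descriptions into retrievable pieces before embedding. A simplified fixed-size chunker with overlap, for illustration only (the chunk size, overlap, and character-based splitting here are assumptions, not the project's actual settings):

```python
def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10):
    """Split text into overlapping character chunks of at most chunk_size."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)
            if text[i:i + chunk_size]]

desc = "The TechPro Ultrabook features a 15.6-inch display, 16GB RAM and a 1TB SSD."
for chunk in chunk_text(desc):
    print(repr(chunk))
```

The overlap means the tail of each chunk is repeated at the head of the next, so a sentence cut at a chunk boundary still appears whole in at least one chunk.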
## Potential Future Improvements
- Fine-tuning of metadata filters
- Better routing logic than the simple deterministic routing used at the moment
- Better context-awareness
- Better extraction of categories and products from user query (either LLM-based or stemming to handle plurals)
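For the last item, a lightweight suffix-stripping step could normalize plural product and category terms before catalog lookup. A naive sketch of the idea (a production version would more likely use an established stemmer such as NLTK's `PorterStemmer`, or an LLM call as the list suggests):

```python
def singularize(word: str) -> str:
    """Naive plural stripping for catalog term matching; not a full stemmer."""
    w = word.lower()
    if w.endswith("ies") and len(w) > 3:
        return w[:-3] + "y"                  # "accessories" -> "accessory"
    if w.endswith("es") and w[-3] in "sxz":
        return w[:-2]                        # "boxes" -> "box"
    if w.endswith("s") and not w.endswith("ss"):
        return w[:-1]                        # "laptops" -> "laptop"
    return w

print([singularize(w) for w in ["laptops", "accessories", "boxes", "glass"]])
# → ['laptop', 'accessory', 'box', 'glass']
```

Even this crude normalization lets "laptops" match a catalog category stored as "laptop", at the cost of occasional false positives that a real stemmer handles better.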