---
title: chatassistant_retail
emoji: 🦄
colorFrom: indigo
colorTo: pink
sdk: gradio
short_description: Web-based conversational AI chatbot designed specifically...
---
# chatassistant_retail
![PyPI version](https://img.shields.io/pypi/v/chatassistant_retail.svg)
[![Documentation Status](https://readthedocs.org/projects/chatassistant_retail/badge/?version=latest)](https://chatassistant_retail.readthedocs.io/en/latest/?version=latest)
**chatassistant_retail** is a production-ready conversational AI chatbot designed specifically for the retail industry, providing intelligent assistance for retail inventory management. It features a multi-modal interface (text + images) powered by Azure OpenAI GPT-4o-mini, hybrid RAG search with Azure Cognitive Search, stateful conversation management via LangGraph, and flexible session persistence (Memory/Redis/PostgreSQL). The system includes a Gradio-based web UI, MCP tool integration, and comprehensive observability with LangFuse.
* Python Version: >= 3.10 (tested on 3.10, 3.11, 3.12, and 3.13)
---
## Table of Contents
- [Features](#features)
- [Architecture](#architecture)
- [Project Structure](#project-structure)
- [Installation](#installation)
- [Sample Data](#sample-data)
- [Usage](#usage)
- [Development](#development)
- [Scripts Reference](#scripts-reference)
- [Session Management](#session-management)
- [Multi-Modal Processing](#multi-modal-processing)
- [Observability](#observability)
- [Deployment Options](#deployment-options)
- [Testing](#testing)
- [Contributing](#contributing)
- [Credits](#credits)
---
## Features
### Core Capabilities
- **Conversational Interface**: Gradio-based web UI for natural language interactions with retail inventory systems
- **Retail Inventory Management**: Specialized chatbot for handling inventory queries, stock levels, and purchase orders
- **Natural Language Understanding**: Powered by Azure OpenAI (GPT-4o-mini) for understanding and responding to retail-related questions
- **Agentic Workflow**: LangGraph-based state machine for complex multi-step conversations
- **Tool Integration**: MCP (Model Context Protocol) server for inventory and purchase order tools
### Technical Features
- **Multi-Modal Input Processing**: Handle both text and images (PNG, JPG, JPEG, WebP) for product analysis and visual queries
- **Context Caching System**: Smart data reuse across conversation turns reduces redundant I/O operations and improves response times
- **Image-Based Product Lookup**: Upload product images for AI-powered identification, catalog matching, and automated inventory checking
- **Low-Stock Automation**: Automatic reorder recommendations when visually identified products are low in stock
- **LangGraph Orchestration**: Stateful conversation management with persistent session storage
- **Hybrid RAG Search**: Vector + keyword + semantic search via Azure Cognitive Search with automatic fallback to local data
- **Flexible Session Persistence**: Three backend options - Memory (fast, ephemeral), Redis (distributed), or PostgreSQL (full persistence)
- **Semantic Search**: AI-powered relevance ranking with Free tier support (1,000 queries/month)
- **Observability**: Built-in LangFuse integration for tracing, monitoring, and analytics
- **Async Processing**: Asynchronous operations for high-performance request handling
- **Graceful Fallbacks**: Automatic degradation when Azure services are unavailable (local data, keyword search)
### Retail-Specific Features
- **Inventory Queries**: Check stock levels, product availability, and warehouse information
- **Purchase Order Management**: Create, track, and manage purchase orders
- **Sample Data Generation**: Generate realistic product catalogs and sales history using Faker (500+ products, 6 months sales)
- **Product Search**: Semantic search across product catalog using Azure Cognitive Search with visual product matching
### Deployment & Development
- **Deployment Flexibility**: Local development, HuggingFace Spaces, or production deployment (Azure App Service, Docker, K8s)
- **Development Tools**: Comprehensive test suite, data generation scripts, Azure Search setup automation
- **Multi-Environment Configuration**: Environment-based settings with validation and graceful fallbacks
---
## Architecture
The chatbot follows an agentic architecture pattern using LangGraph for state management and orchestration:
```
┌─────────────────────────────────────────────────────────┐
│                  Gradio Web Interface                   │
│   ┌─────────────┐      ┌───────────────┐                │
│   │   Chat UI   │      │    Session    │                │
│   │             │      │  Management   │                │
│   └─────────────┘      └───────────────┘                │
└────────────────────────┬────────────────────────────────┘
┌────────────────────────▼────────────────────────────────┐
│                 LangGraph State Manager                 │
│  ┌──────────────────┐   ┌──────────────────────────┐    │
│  │   State Graph    │   │      Session Store       │    │
│  │    (Workflow)    │   │   (Memory/PostgreSQL)    │    │
│  └──────────────────┘   └──────────────────────────┘    │
└────────────────────────┬────────────────────────────────┘
┌────────────────────────▼────────────────────────────────┐
│                  Business Logic Layer                   │
│  ┌──────────────┐  ┌─────────────┐  ┌──────────────┐    │
│  │    Azure     │  │  Inventory  │  │   Purchase   │    │
│  │    OpenAI    │  │    Tools    │  │ Order Tools  │    │
│  └──────────────┘  └─────────────┘  └──────────────┘    │
└────────────────────────┬────────────────────────────────┘
┌────────────────────────▼────────────────────────────────┐
│                 Data/Integration Layer                  │
│  ┌──────────────┐  ┌─────────────┐  ┌──────────────┐    │
│  │  PostgreSQL  │  │    Azure    │  │    Redis     │    │
│  │  (Sessions)  │  │   Search    │  │   (Cache)    │    │
│  └──────────────┘  └─────────────┘  └──────────────┘    │
└────────────────────────┬────────────────────────────────┘
┌────────────────────────▼────────────────────────────────┐
│                   Observability Layer                   │
│  ┌──────────────┐  ┌─────────────┐  ┌──────────────┐    │
│  │   LangFuse   │  │   Metrics   │  │    Python    │    │
│  │   Tracing    │  │  Collector  │  │   Logging    │    │
│  └──────────────┘  └─────────────┘  └──────────────┘    │
└─────────────────────────────────────────────────────────┘
```
### Key Components
- **Gradio UI** (`ui/`): Web-based chat interface with multi-modal input (text + images)
- **LangGraph State Manager** (`state/`): Conversation state management with Memory/Redis/PostgreSQL session stores
- **Workflow Orchestration** (`workflow/`): Image-based product lookup with multi-step automation (vision → search → inventory → recommendations)
- **Context Utilities** (`tools/context_utils.py`): Smart caching layer for performance optimization and data coherence
- **Azure OpenAI Client** (`llm/`): Multi-modal LLM integration (GPT-4o-mini) with prompt templates and response parsing
- **Context-Aware Tools** (`tools/`): Inventory and purchase order tools with optional state parameter for intelligent cache reuse
- **RAG System** (`rag/`): Hybrid search with Azure Cognitive Search (vector + keyword + semantic) and local fallback
- **Observability** (`observability/`): LangFuse tracing and metrics collection across all components
- **Data Models** (`data/`): Pydantic models for products, sales, and purchase orders with sample data generation
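The context caching behavior can be illustrated with a small sketch: tool results are stored in the conversation state and reused on later turns instead of re-reading data. The function and key names below are hypothetical, not the project's actual API:

```python
from typing import Any, Callable

def get_or_load(state: dict, cache_key: str, loader: Callable[[], Any]) -> Any:
    """Return data cached in the conversation state, loading it once on a miss."""
    cache = state.setdefault("context_cache", {})
    if cache_key not in cache:
        cache[cache_key] = loader()  # expensive I/O only happens on a cache miss
    return cache[cache_key]

# Two tool calls in the same conversation share a single catalog load
calls = []
def load_catalog():
    calls.append(1)
    return [{"sku": "SKU-10001", "stock": 50}]

state = {}
first = get_or_load(state, "products", load_catalog)
second = get_or_load(state, "products", load_catalog)
```

Because both calls receive the same state, the second lookup is served from the cache and the loader runs only once.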
### Design Principles
- **Separation of Concerns**: Clear separation between UI, orchestration, business logic, and data layers
- **Stateful Conversations**: LangGraph manages conversation state with checkpointing
- **Tool-Based Architecture**: LLM invokes tools (inventory queries, purchase orders) through structured outputs
- **Observable by Default**: All LLM calls and tool invocations traced with LangFuse
- **Error Resilience**: Graceful degradation and comprehensive error handling
---
## Project Structure
This project uses the **src-layout** pattern for better development and testing practices:
```
chatassistant_retail/
├── app.py # HuggingFace Spaces entry point
├── src/
│ └── chatassistant_retail/
│ ├── __init__.py # Package initialization
│ ├── __main__.py # Application entry point
│ ├── cli.py # CLI entry point
│ ├── chatbot.py # Main chatbot orchestrator (multi-modal)
│ │
│ ├── ui/ # Gradio web interface
│ │ ├── __init__.py
│ │ ├── gradio_app.py # Main Gradio application
│ │ ├── chat_interface.py # Chat UI components
│ │ └── metrics_dashboard.py # Observability dashboard (UI currently disabled)
│ │
│ ├── state/ # LangGraph state management
│ │ ├── __init__.py
│ │ ├── langgraph_manager.py # State graph orchestration
│ │ ├── session_store.py # Abstract session interface
│ │ ├── memory_store.py # In-memory store (HF Spaces)
│ │ ├── redis_store.py # Redis store (distributed)
│ │ └── postgresql_store.py # PostgreSQL store (persistent)
│ │
│ ├── llm/ # LLM integration
│ │ ├── __init__.py
│ │ ├── azure_openai_client.py # Azure OpenAI client (GPT-4o-mini)
│ │ ├── prompt_templates.py # System/user prompts
│ │ └── response_parser.py # Response parsing
│ │
│ ├── workflow/ # NEW: Workflow orchestration
│ │ ├── __init__.py
│ │ └── image_processor.py # Image-based product lookup
│ │
│ ├── tools/ # Inventory & PO tools
│ │ ├── __init__.py
│ │ ├── context_utils.py # NEW: Context caching utilities
│ │ ├── inventory_tools.py # UPDATED: Context-aware inventory operations
│ │ ├── purchase_order_tools.py # UPDATED: Context-aware PO operations
│ │ └── mcp_server.py # UPDATED: MCP server with state passing
│ │
│ ├── rag/ # Azure Cognitive Search RAG
│ │ ├── __init__.py
│ │ ├── azure_search_client.py # Hybrid search client (vector+keyword+semantic)
│ │ ├── retriever.py # Document retrieval with fallback
│ │ └── embeddings.py # Embedding generation
│ │
│ ├── data/ # Data models and generation
│ │ ├── __init__.py
│ │ ├── models.py # Product, Sale, PurchaseOrder models
│ │ └── generator.py # Sample data generator (Faker)
│ │
│ ├── observability/ # LangFuse observability
│ │ ├── __init__.py
│ │ ├── langfuse_client.py # LangFuse wrapper
│ │ ├── decorators.py # @trace decorator
│ │ └── metrics_collector.py # Metrics aggregation
│ │
│ └── config/ # Configuration
│ ├── __init__.py
│ ├── settings.py # Pydantic settings (env-based)
│ └── deployment.py # Deployment configs
├── data/ # Sample data files
│ ├── products.json # 500+ sample products (216KB)
│ ├── sales_history.json # 6 months sales data (3.5MB)
│ └── purchase_orders.json # Sample purchase orders
├── scripts/ # Utility scripts
│ ├── setup_azure_search.py # Azure Search index setup
│ ├── generate_sample_data.py # Generate sample product/sales data
│ ├── test_gradio_ui.py # UI testing script
│ ├── test_phase2.py # Integration testing
│ └── test_phase3.py # E2E scenario testing
├── tests/
│ ├── __init__.py
│ ├── unit/ # Unit tests
│ │ ├── test_context_utils.py # NEW: Context caching tests
│ │ ├── test_image_processor.py # NEW: Image workflow tests
│ │ ├── test_observability.py
│ │ ├── test_inventory_tools.py
│ │ ├── test_session_store.py
│ │ ├── test_retriever.py
│ │ ├── test_azure_search_client.py
│ │ ├── test_azure_openai_client.py
│ │ ├── test_mcp_server.py
│ │ └── test_data_generator.py
│ ├── integration/ # Integration tests
│ │ ├── test_tool_context_integration.py # NEW: Context-aware tool tests
│ │ └── test_state_manager.py
│ └── test_chatassistant_retail.py # Main tests
├── docs/ # Sphinx documentation
│ ├── conf.py
│ ├── index.rst
│ └── usage.rst
├── .github/
│ └── workflows/
│ └── test.yml # CI/CD pipeline
├── pyproject.toml # Project metadata and dependencies
├── justfile # Task automation
├── CLAUDE.md # Claude Code guidance
├── README.md # This file
├── HISTORY.md # Changelog
└── LICENSE # MIT License
```
### Key Directories
- **app.py**: HuggingFace Spaces entry point (sets deployment mode and launches Gradio on 0.0.0.0:7860)
- **ui/**: Gradio-based web interface with multi-modal chat
- **state/**: LangGraph state machine with three session backends (Memory/Redis/PostgreSQL)
- **llm/**: Azure OpenAI integration (GPT-4o-mini) with prompt engineering and multi-modal support
- **tools/**: Inventory and purchase order tools with MCP server integration
- **rag/**: Hybrid search with Azure Cognitive Search (vector+keyword+semantic) and local fallback
- **data/**: Pydantic data models and Faker-based sample data generation
- **observability/**: LangFuse tracing, metrics collection, and monitoring
- **config/**: Pydantic settings with environment variable support and validation
- **scripts/**: Setup scripts (Azure Search index, data generation, testing)
- **data/ (root)**: Sample JSON files (products, sales history, purchase orders)
- **tests/**: Comprehensive PyTest suite (unit and integration tests)
---
## Installation
### Prerequisites
- Python >= 3.10
- [uv](https://github.com/astral-sh/uv) (Rust-based Python package manager) - **Required**
- [just](https://github.com/casey/just) (Command runner for task automation) - **Required for development**
- Azure OpenAI API access
- Azure Cognitive Search instance (optional, for RAG)
- PostgreSQL database (optional, for persistent sessions)
- Redis instance (optional, for caching)
- LangFuse account (optional, for observability)
#### Installing uv and just
Both `uv` and `just` need to be installed system-wide (not in a virtual environment):
**macOS:**
```bash
# Install with Homebrew (recommended)
brew install uv just
# Or install uv via curl
curl -LsSf https://astral.sh/uv/install.sh | sh
# And install just separately
brew install just
```
**Linux:**
```bash
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install just
curl --proto '=https' --tlsv1.2 -sSf https://just.systems/install.sh | bash
```
**Windows:**
```powershell
# Install uv
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
# Install just with cargo
cargo install just
```
After installation, verify both are available:
```bash
uv --version # Should show: uv 0.9.x or later
just --version # Should show: just 1.x.x or later
```
### Install from PyPI (when published)
```bash
pip install chatassistant_retail
```
### Development Installation
1. Clone the repository:
```bash
git clone https://github.com/samir72/chatassistant_retail.git
cd chatassistant_retail
```
2. Install dependencies with uv:
```bash
uv sync
```
3. Set up environment variables (create `.env` file):
```bash
# Azure OpenAI (Required)
AZURE_OPENAI_API_KEY=your-api-key
AZURE_OPENAI_ENDPOINT=https://your-endpoint.openai.azure.com/
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o-mini # GPT-4o-mini deployment
AZURE_OPENAI_EMBEDDING_DEPLOYMENT=text-embedding-ada-002 # For RAG embeddings
AZURE_OPENAI_API_VERSION=2024-02-15-preview
# Azure Cognitive Search (Optional - fallback to local data if not configured)
AZURE_COGNITIVE_SEARCH_ENDPOINT=https://your-search.search.windows.net
AZURE_COGNITIVE_SEARCH_API_KEY=your-search-key
AZURE_SEARCH_INDEX_NAME=products
# Note: Semantic search must be enabled in Azure Portal (Search Service → Semantic ranker → Free)
# Session Persistence (Optional - defaults to Memory store)
SESSION_STORE_TYPE=memory # Options: memory, redis, postgresql
REDIS_URL=redis://localhost:6379/0 # If using redis
POSTGRES_CONNECTION_STRING=postgresql://user:password@localhost:5432/chatbot # If using postgresql
# Deployment Configuration (Optional)
DEPLOYMENT_MODE=local # Options: local, hf_spaces
LOG_LEVEL=INFO # DEBUG, INFO, WARNING, ERROR
# LangFuse Observability (Optional - recommended for production)
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_HOST=https://cloud.langfuse.com
LANGFUSE_ENABLED=true
```
4. Verify installation:
```bash
just test
```
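These environment variables are read at startup; conceptually the configuration layer behaves like the minimal loader below (an illustrative stdlib sketch — the project itself uses Pydantic settings with validation):

```python
import os
from dataclasses import dataclass

@dataclass
class Settings:
    """Illustrative env-backed settings; field names mirror the .env keys above."""
    session_store_type: str = "memory"
    deployment_mode: str = "local"
    log_level: str = "INFO"

    @classmethod
    def from_env(cls) -> "Settings":
        # Each field falls back to the documented default when the variable is unset
        return cls(
            session_store_type=os.getenv("SESSION_STORE_TYPE", "memory"),
            deployment_mode=os.getenv("DEPLOYMENT_MODE", "local"),
            log_level=os.getenv("LOG_LEVEL", "INFO"),
        )

settings = Settings.from_env()
```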
---
## Sample Data
The repository includes **sample data files** for immediate testing and development:
| File | Size | Description |
|------|------|-------------|
| `data/products.json` | 220 KB | 500+ sample products across 8 retail categories (Electronics, Clothing, Groceries, etc.) |
| `data/sales_history.json` | 3.6 MB | 6 months of sales transaction history with seasonal patterns |
| `data/purchase_orders.json` | 1.3 KB | Sample purchase orders (pending/fulfilled statuses) |
**Total size:** ~3.8 MB
### Why Sample Data is Included
Previously, the `data/` directory was excluded from version control to keep the repository lean. As of version 0.1.1, sample data is **included by default** for:
- ✅ Faster setup for new developers
- ✅ Consistent test data across environments
- ✅ No need to run data generation scripts initially
### Regenerating Sample Data (Optional)
If you want to regenerate or customize sample data:
```bash
python scripts/generate_sample_data.py
```
This will create fresh sample data with configurable parameters.
---
## Usage
### Setting Up Azure Cognitive Search
If you're using Azure Cognitive Search for RAG (Retrieval-Augmented Generation), you need to create the search index before running the chatbot.
#### Prerequisites
- Azure Cognitive Search service created
- Environment variables configured:
- `AZURE_COGNITIVE_SEARCH_ENDPOINT`
- `AZURE_COGNITIVE_SEARCH_API_KEY`
- `AZURE_SEARCH_INDEX_NAME` (optional, defaults to "products")
#### Create the Index
Run the setup script:
```bash
python scripts/setup_azure_search.py
```
This will:
1. Create the "products" index with the proper schema
2. Configure vector search (1536-dimensional embeddings)
3. Set up semantic search capabilities
4. Optionally load 500 sample products with embeddings
#### Verify Setup
Check index health:
```python
from chatassistant_retail.rag import AzureSearchClient
from chatassistant_retail.config import get_settings
client = AzureSearchClient(get_settings())
health = client.check_index_health()
print(f"Status: {health['overall_status']}")
print(f"Document count: {health['stats']['document_count']}")
```
**Note:** If the index doesn't exist, the chatbot will still work but will fall back to local product data without vector search capabilities.
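Conceptually, each hybrid query fuses a keyword search string with a vector query over the embedding field, then applies semantic re-ranking. The sketch below builds such a request body; the field names (e.g. `content_vector`) are illustrative and the exact JSON shape depends on the Azure Search API version:

```python
def build_hybrid_query(text: str, embedding: list[float], top: int = 5) -> dict:
    """Assemble a hybrid (keyword + vector + semantic) request body.

    This loosely follows the Azure Search REST shape; treat it as a sketch,
    not the project's actual client code.
    """
    return {
        "search": text,                    # keyword (BM25) component
        "vectorQueries": [{                # vector component over the embedding field
            "kind": "vector",
            "vector": embedding,
            "fields": "content_vector",    # hypothetical embedding field name
            "k": top,
        }],
        "queryType": "semantic",           # semantic re-ranking on the fused results
        "top": top,
    }

body = build_hybrid_query("winter jacket", [0.0] * 1536, top=3)
```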
---
### Running the Chatbot
#### Start Gradio Web Interface
```bash
# Using Python module
python -m chatassistant_retail
# Or using the CLI entry point
chatassistant-retail
# With custom port
chatassistant-retail --port 7860
```
The web interface will be available at `http://localhost:7860` with:
- **Chat Interface**: Main conversational UI with multi-modal support (text + images)
#### Command-Line Usage
```bash
# Interactive CLI mode
chatassistant-retail --cli
# Single query
chatassistant-retail --query "What is the stock level for SKU-12345?"
```
### Example Conversations
#### Inventory Query
```
User: What is the current stock level for SKU-12345?
Assistant: Let me check the inventory for SKU-12345...
[Tool: inventory_tools.check_stock_level(sku="SKU-12345")]
The current stock level for SKU-12345 is 150 units across 3 warehouses.
```
#### Purchase Order Management
```
User: Create a purchase order for 500 units of SKU-67890 from Supplier ABC
Assistant: I'll create a purchase order for you...
[Tool: purchase_order_tools.create_po(sku="SKU-67890", quantity=500, supplier="ABC")]
Purchase order PO-2025-001 created successfully for 500 units of SKU-67890.
Expected delivery: 2025-01-15
```
#### Product Search with RAG
```
User: Tell me about our winter jacket inventory
Assistant: Let me search our product catalog...
[RAG: Azure Cognitive Search retrieval with semantic search]
We have 5 winter jacket styles in stock with a total of 1,250 units.
Most popular: "Alpine Puffer Jacket" (SKU-WJ-001) with 450 units available.
```
#### Multi-Modal Query (Image + Text)
```
User: [Uploads product image] Is this the same as SKU-12345? What's the stock level?
Assistant: Analyzing the product image...
[Multi-Modal: GPT-4o-mini processing]
[Tool: inventory_tools.check_stock_level(sku="SKU-12345")]
Yes, this appears to be the same product as SKU-12345 (Blue Athletic Sneaker).
Current stock: 85 units available across 2 warehouses.
```
#### Image-Based Product Workflow (NEW)
```
User: [Uploads product image] "Check inventory for this item"
Assistant: Let me analyze the product image...
[ImageProductProcessor workflow]
- Vision Extraction: Identifies "Wireless Optical Mouse" (Electronics, confidence: 0.89)
- Catalog Search: Finds 2 matching products via RAG
- Inventory Check:
* SKU-10001: 50 units (adequate stock)
* SKU-10002: 8 units (below reorder level of 20)
- Reorder Recommendation: Order 50 units of SKU-10002 (HIGH urgency, 5 days until stockout)
Response:
"🔍 Product Identification Results
I identified: Wireless Optical Mouse
Category: Electronics
📦 Matching Products in Inventory:
1. Wireless Optical Mouse (SKU-10001)
- Price: $24.99
- Current Stock: 50 units
- Reorder Level: 10 units
- Status: OK
- Supplier: TechSupplies Inc
2. Ergonomic Wireless Mouse (SKU-10002)
- Price: $34.99
- Current Stock: 8 units
- Reorder Level: 20 units
- Status: LOW STOCK
- Supplier: ComfortTech
💡 Recommendations:
⚠️ Ergonomic Wireless Mouse (SKU-10002) is running low:
- Days until stockout: 5
- Suggested order quantity: 50 units
- Urgency: HIGH
Would you like me to create a purchase order?"
```
---
## Development
This project uses `just` for task automation. All development commands are defined in the `justfile`.
### Quick Start
```bash
# List all available commands
just list
# Run full QA pipeline (format, lint, type-check, test)
just qa
# Run tests
just test
# Run tests with debugger on failure
just pdb
# Generate coverage report
just coverage
```
### Development Workflow
1. **Make changes** to the codebase
2. **Run QA**: `just qa` (formats, lints, type-checks, and tests)
3. **Debug failures**: `just pdb` if tests fail
4. **Check coverage**: `just coverage` to ensure adequate test coverage
5. **Build package**: `just build` when ready to release
### Code Quality Tools
- **Ruff**: Fast Python linter and formatter (line length: 120)
- **isort**: Import sorting (integrated with Ruff)
- **ty**: Type checking with all rules enabled
- **pytest**: Testing framework with async support and coverage reporting
### Local Development Setup
```bash
# Activate virtual environment (if not using uv)
source .venv/bin/activate
# Install pre-commit hooks (optional)
pre-commit install
# Run development server with auto-reload
python -m chatassistant_retail --reload
# Run tests in watch mode
pytest-watch
```
### Data Generation
The project includes tools for generating realistic sample data for development and testing.
#### Generating Sample Data
Use the `generate_sample_data.py` script to create sample products and sales history:
```bash
# Generate default data (500 products, 6 months sales history)
python scripts/generate_sample_data.py
# Custom data generation
python scripts/generate_sample_data.py --count 1000 --months 12
```
#### What Gets Generated
**Products (`data/products.json`):**
- 500+ realistic retail products (configurable)
- Categories: Electronics, Clothing, Home & Garden, Sports, Books
- Realistic pricing, descriptions, and metadata
- Auto-generated embeddings for Azure Search (1536 dimensions)
**Sales History (`data/sales_history.json`):**
- 6 months of transactional data (configurable)
- Realistic sales patterns (seasonality, trends)
- Multiple warehouses and channels
- Customer demographics
**Purchase Orders (`data/purchase_orders.json`):**
- Sample PO data for testing
- Various suppliers and statuses
- Delivery tracking information
#### Integration with Azure Search
The generated data includes pre-computed embeddings for immediate upload to Azure Search:
```bash
# 1. Generate sample data with embeddings
python scripts/generate_sample_data.py
# 2. Upload to Azure Search
python scripts/setup_azure_search.py --load-data
```
#### Customization
The data generator uses [Faker](https://faker.readthedocs.io/) for realistic data generation. Customize by modifying `src/chatassistant_retail/data/generator.py`.
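In outline, the generator seeds an RNG and emits structured product records; a stdlib-only sketch (field names are representative, and the real generator layers Faker on top for names and descriptions):

```python
import random

CATEGORIES = ["Electronics", "Clothing", "Home & Garden", "Sports", "Books"]

def generate_products(count: int, seed: int = 42) -> list[dict]:
    """Deterministic sketch of the product generator."""
    rng = random.Random(seed)  # seeding keeps sample data reproducible across runs
    return [
        {
            "sku": f"SKU-{10000 + i}",
            "category": rng.choice(CATEGORIES),
            "price": round(rng.uniform(5.0, 500.0), 2),
            "stock": rng.randint(0, 500),
            "reorder_level": rng.randint(5, 50),
        }
        for i in range(count)
    ]

products = generate_products(5)
```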
---
## Scripts Reference
The `scripts/` directory contains utility scripts for setup, testing, and data management.
### setup_azure_search.py
**Purpose:** Create and configure Azure Cognitive Search index with proper schema for RAG.
**Usage:**
```bash
# Create index (prompts to load sample data)
python scripts/setup_azure_search.py
# Create index and auto-load data
python scripts/setup_azure_search.py --load-data
# Recreate existing index
python scripts/setup_azure_search.py --recreate
```
**What It Does:**
- Creates "products" index with hybrid search configuration
- Configures vector search (HNSW algorithm, 1536 dimensions)
- Sets up semantic search capabilities
- Optionally loads 500 sample products with embeddings
- Uploads in batches of 100 for efficiency
- Performs health check and displays statistics
**Requirements:**
- `AZURE_COGNITIVE_SEARCH_ENDPOINT` environment variable
- `AZURE_COGNITIVE_SEARCH_API_KEY` environment variable
- `AZURE_OPENAI_EMBEDDING_DEPLOYMENT` for embedding generation
### generate_sample_data.py
**Purpose:** Generate realistic retail data for development and testing.
**Usage:**
```bash
# Default: 500 products, 6 months sales
python scripts/generate_sample_data.py
# Custom counts
python scripts/generate_sample_data.py --count 1000 --months 12
# Dry run (don't save files)
python scripts/generate_sample_data.py --dry-run
```
**Output:**
- `data/products.json` - Product catalog with embeddings
- `data/sales_history.json` - Sales transactions
- `data/purchase_orders.json` - PO data
### test_gradio_ui.py
**Purpose:** Interactive testing of Gradio UI components.
**Usage:**
```bash
python scripts/test_gradio_ui.py
```
Launches the Gradio interface for manual testing and validation.
### test_phase2.py / test_phase3.py
**Purpose:** Integration and end-to-end testing scripts.
**Usage:**
```bash
# Integration testing
python scripts/test_phase2.py
# E2E scenario testing
python scripts/test_phase3.py
```
Tests the complete chatbot workflow including LangGraph state management, tool execution, and RAG retrieval.
---
## Session Management
The chatbot supports three different session storage backends for conversation state persistence, allowing you to choose the right balance between simplicity, performance, and durability.
### Session Store Backends
#### Memory Store (Default)
The in-memory session store is the default and simplest option, ideal for development and HuggingFace Spaces deployment.
**Characteristics:**
- ✅ **Fast**: No network latency, instant access
- ✅ **Simple**: No external dependencies required
- ✅ **Auto-configured**: Default for HF Spaces deployment
- ⚠️ **Ephemeral**: Sessions lost on restart
- ⚠️ **Single-instance**: Not shared across multiple app instances
**Configuration:**
```bash
SESSION_STORE_TYPE=memory # or omit (default)
```
**Use When:**
- Developing locally
- Deploying to HuggingFace Spaces
- Session persistence not critical
- Running single instance
#### Redis Store
Redis provides distributed session storage with persistence, ideal for production deployments with multiple instances.
**Characteristics:**
- ✅ **Fast**: In-memory with disk persistence
- ✅ **Distributed**: Shared across multiple app instances
- ✅ **Persistent**: Survives app restarts (with RDB/AOF)
- ✅ **TTL Support**: Automatic session expiration
- ⚠️ **Requires Redis**: External service dependency
**Configuration:**
```bash
SESSION_STORE_TYPE=redis
REDIS_URL=redis://localhost:6379/0
# Or for production with auth:
REDIS_URL=redis://:password@redis-host:6379/0
```
**Use When:**
- Running multiple app instances (load balanced)
- Need distributed session sharing
- Want automatic session expiration
- Production deployment with high availability
#### PostgreSQL Store
PostgreSQL provides full persistence with queryable session history, ideal for audit requirements and analytics.
**Characteristics:**
- ✅ **Fully Persistent**: Durable storage with ACID guarantees
- ✅ **Queryable**: SQL access to session data and history
- ✅ **Audit Trail**: Complete conversation history
- ✅ **Backup/Recovery**: Standard database backup tools
- ⚠️ **Slower**: Disk I/O overhead vs in-memory stores
- ⚠️ **Requires PostgreSQL**: External database dependency
**Configuration:**
```bash
SESSION_STORE_TYPE=postgresql
POSTGRES_CONNECTION_STRING=postgresql://user:password@localhost:5432/chatbot
```
**Use When:**
- Need complete audit trail
- Compliance/regulatory requirements
- Want to query conversation history
- Long-term session retention needed
- Analytics on conversation patterns
### Configuration Examples
**Local Development (Memory):**
```bash
# .env
SESSION_STORE_TYPE=memory # Fast, simple, ephemeral
```
**HuggingFace Spaces (Memory):**
```bash
# Automatically configured via app.py
DEPLOYMENT_MODE=hf_spaces
SESSION_STORE_TYPE=memory # Default for HF Spaces
```
**Production (Redis):**
```bash
# .env
SESSION_STORE_TYPE=redis
REDIS_URL=redis://:your-password@redis.example.com:6379/0
```
**Enterprise (PostgreSQL):**
```bash
# .env
SESSION_STORE_TYPE=postgresql
POSTGRES_CONNECTION_STRING=postgresql://chatbot:password@db.example.com:5432/chatbot
```
### Choosing a Backend
| Criteria | Memory | Redis | PostgreSQL |
|----------|--------|-------|------------|
| **Speed** | ⭐⭐⭐ Fastest | ⭐⭐ Very Fast | ⭐ Fast |
| **Persistence** | ❌ None | ⭐⭐ Configurable | ⭐⭐⭐ Full |
| **Multi-Instance** | ❌ No | ✅ Yes | ✅ Yes |
| **Setup Complexity** | ⭐⭐⭐ None | ⭐⭐ Moderate | ⭐ Complex |
| **Cost** | Free | $ Low | $$ Moderate |
| **Best For** | Dev, HF Spaces | Production | Enterprise, Audit |
---
## Multi-Modal Processing
The chatbot supports multi-modal input, allowing users to send both text and images for visual product analysis, comparison, and identification.
### Overview
Powered by Azure OpenAI GPT-4o-mini, the chatbot can:
- Analyze product images to identify items
- Compare products visually against catalog images
- Extract product details from photos (color, style, features)
- Verify product authenticity and condition
- Assist with visual inventory checks
### Supported Formats
**Image Formats:**
- PNG (.png)
- JPEG (.jpg, .jpeg)
- WebP (.webp)
**Size Limits:**
- Maximum file size: 20MB (Azure OpenAI limit)
- Recommended resolution: 2048x2048 pixels or less
- Images automatically resized if too large
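The format and size rules above amount to a simple pre-flight check; a sketch (the helper name and return shape are illustrative):

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}
MAX_IMAGE_BYTES = 20 * 1024 * 1024  # 20 MB Azure OpenAI limit

def validate_image_upload(filename: str, size_bytes: int) -> tuple[bool, str]:
    """Check an upload against the supported formats and the size limit."""
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"unsupported format: {ext or '(none)'}"
    if size_bytes > MAX_IMAGE_BYTES:
        return False, "image exceeds the 20 MB limit"
    return True, "ok"

ok, reason = validate_image_upload("sneaker.webp", 2_500_000)
```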
### Usage Examples
#### Text-Only Query
```
# Via Gradio UI: Type in chat box
User: "What is the stock level for SKU-12345?"
```
#### Image + Text Query
```
# Via Gradio UI: Click image upload button, select image, then type query
User: [Uploads product photo] "Is this product in our catalog? Can you check inventory?"
```
Assistant analyzes the image using GPT-4o-mini, compares it against the catalog, and provides inventory information.
#### Product Image Analysis
```
User: [Uploads warehouse photo showing multiple items] "Identify all products in this image and check stock levels"
```
The chatbot can identify multiple products in a single image and provide bulk inventory information.
### Best Practices
**Image Quality:**
- Use clear, well-lit photos
- Ensure products are centered and in focus
- Avoid excessive image compression
**Query Construction:**
- Combine images with specific questions for best results
- Reference SKUs or product names when known
- Ask focused questions (inventory, identification, comparison)
**Supported Use Cases:**
- ✅ Product identification from photos
- ✅ Visual comparison against catalog
- ✅ Quality/authenticity verification
- ✅ Bulk identification from warehouse photos
- ❌ Image generation or editing (not supported)
---
## Observability
The chatbot includes comprehensive observability using **LangFuse** for distributed tracing and monitoring.
### LangFuse Integration
LangFuse is integrated throughout the application for automatic tracing of:
- **LLM Calls**: All Azure OpenAI requests with prompts, completions, and token usage
- **Tool Invocations**: Inventory queries and purchase order operations
- **State Transitions**: LangGraph state changes and workflow steps
- **RAG Operations**: Document retrieval and embedding generation
### Configuration
Set up LangFuse in your `.env` file:
```bash
LANGFUSE_PUBLIC_KEY=pk-lf-your-public-key
LANGFUSE_SECRET_KEY=sk-lf-your-secret-key
LANGFUSE_HOST=https://cloud.langfuse.com # or self-hosted
LANGFUSE_ENABLED=true
```
### Using the @trace Decorator
Automatically trace any function:
```python
from chatassistant_retail.observability import trace

@trace(name="inventory_check", trace_type="tool")
async def check_inventory(sku: str):
    # Function automatically traced in LangFuse
    return await inventory_service.get_stock(sku)
```
### Manual Tracing (Advanced Usage)
**Note:** For most use cases, the `@trace` decorator is recommended. Use manual tracing only when you need fine-grained control over span lifecycle.
```python
from chatassistant_retail.observability import create_span, log_event

# Create a span (must call .end() when done)
span = create_span(
    name="complex_workflow",
    input_data={"query": "user input"},
    metadata={"user_id": "123"},
)
try:
    # Do work
    result = perform_operation()
    # Log events within the span
    log_event(
        name="operation_milestone",
        level="INFO",
        input_data={"checkpoint": "halfway"},
    )
    # Update span with output
    span.update(output={"result": result})
finally:
    # Always end the span
    span.end()
```
**Important:** Spans created with `create_span()` must be explicitly ended with `.end()` to avoid memory leaks. Use the `@trace` decorator for automatic lifecycle management.
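If you do need manual spans, a small context manager can make the `.end()` guarantee automatic even when the work raises. This is a sketch, not part of the project's API; it assumes only that the span factory and span object behave as documented above (`create_span(...)` returns an object with `.update()` and `.end()`).

```python
from contextlib import contextmanager


@contextmanager
def traced_span(create_span, name: str, **kwargs):
    """Wrap a create_span-style factory so .end() is always called."""
    span = create_span(name=name, **kwargs)
    try:
        yield span
    finally:
        span.end()  # runs even if the body raises


# Minimal stand-in span for demonstration; the real one comes from LangFuse.
class _DemoSpan:
    def __init__(self):
        self.ended = False
        self.output = None

    def update(self, output=None):
        self.output = output

    def end(self):
        self.ended = True


demo = _DemoSpan()
with traced_span(lambda name, **kw: demo, "complex_workflow") as span:
    span.update(output={"result": 42})
```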
### Metrics Dashboard
**Status:** The Gradio UI metrics dashboard is currently disabled.
**Access Metrics:**
- **LangFuse Web Dashboard:** https://cloud.langfuse.com (recommended for production monitoring)
- **Programmatic Access:** Use the `MetricsCollector` class directly (see Metrics Collection section below)
All observability infrastructure remains active. The `MetricsCollector` class continues to aggregate data from LangFuse traces for programmatic access.
### Metrics Collection
The `MetricsCollector` class aggregates data from LangFuse traces:
```python
from chatassistant_retail.observability import MetricsCollector
collector = MetricsCollector()
metrics = collector.get_dashboard_data()
print(f"Total queries: {metrics['total_queries']}")
print(f"Avg response time: {metrics['avg_response_time']:.2f}s")
print(f"Success rate: {metrics['success_rate']:.1f}%")
```
### Logging
Structured logging with Python's `logging` module:
```python
import logging

logger = logging.getLogger(__name__)
logger.info("Processing user query", extra={
    "session_id": session_id,
    "query_length": len(query),
})
```
Log levels are configurable via environment variables:
```bash
LOG_LEVEL=INFO # DEBUG, INFO, WARNING, ERROR, CRITICAL
```
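One way to honour `LOG_LEVEL` at startup is to resolve it against the `logging` module's level names, falling back to `INFO` for unknown values. This is a sketch of the pattern; the project may wire its logging configuration differently.

```python
import logging
import os

# Resolve the configured level, defaulting to INFO for unset or unknown values
level_name = os.getenv("LOG_LEVEL", "INFO").upper()
level = getattr(logging, level_name, logging.INFO)

logging.basicConfig(
    level=level,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)
logger = logging.getLogger("chatassistant_retail")
```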
### Monitoring in Production
LangFuse provides:
- **Request Tracing**: End-to-end visibility of each conversation
- **Performance Metrics**: Latency, throughput, and error rates
- **Cost Tracking**: Token usage and API costs per request
- **User Analytics**: Session duration, query patterns, tool usage
- **Error Analysis**: Exception tracking and debugging
Access your LangFuse dashboard at https://cloud.langfuse.com to view:
- Real-time trace explorer
- Analytics dashboards
- Cost reports
- User session replays
---
## Deployment Options
The chatbot supports multiple deployment scenarios, from local development to production hosting on cloud platforms.
### Local Development
**Quick Start:**
```bash
# Clone repository
git clone https://github.com/samir72/chatassistant_retail.git
cd chatassistant_retail
# Install dependencies
uv sync
# Set up environment variables
cp .env.example .env
# Edit .env with your Azure credentials
# Run locally
python -m chatassistant_retail
```
**Features:**
- Hot reload with `--reload` flag
- Full debugging capabilities
- All features enabled (Azure Search, Redis, PostgreSQL, LangFuse)
- Access at `http://localhost:7860`
**Session Storage:** Any (Memory, Redis, PostgreSQL)
### HuggingFace Spaces
Deploy directly to HuggingFace Spaces for free hosting with automatic HTTPS and sharing.
#### Prerequisites
1. HuggingFace account (https://huggingface.co)
2. Azure OpenAI API credentials
3. (Optional) Azure Cognitive Search for RAG
#### Configuration
The `app.py` file is pre-configured for HF Spaces deployment:
```python
# app.py sets deployment mode automatically
os.environ["DEPLOYMENT_MODE"] = "hf_spaces"
```
**Important: src-layout Workaround**
Due to HuggingFace Spaces' Docker build process, this project uses a **sys.path manipulation workaround** instead of standard package installation:
- `requirements.txt` installs only dependencies (no package self-installation via `.`)
- `app.py` adds `src/` directory to Python path at startup
- Imports work without formal package installation
This is intentional and necessary because HF Spaces' auto-generated Dockerfile mounts `requirements.txt` before copying `pyproject.toml`, preventing standard `pip install .` from working. For local development, continue using `pip install -e .` as normal.
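The workaround itself is only a few lines at the top of `app.py`; the sketch below shows the idea (real `app.py` code would anchor the path on `__file__` rather than the working directory, and may differ in detail).

```python
import os
import sys

# Make src/ importable without installing the package (HF Spaces workaround).
# app.py would typically derive this from __file__; cwd is used in this sketch.
SRC_DIR = os.path.join(os.getcwd(), "src")
if SRC_DIR not in sys.path:
    sys.path.insert(0, SRC_DIR)
```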
**Environment Variables (HF Spaces Secrets):**
```bash
# Required
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=...
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o-mini
# Recommended
AZURE_OPENAI_EMBEDDING_DEPLOYMENT=text-embedding-ada-002
AZURE_COGNITIVE_SEARCH_ENDPOINT=...
AZURE_COGNITIVE_SEARCH_API_KEY=...
# Optional
LANGFUSE_PUBLIC_KEY=...
LANGFUSE_SECRET_KEY=...
```
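Missing secrets are a common cause of opaque startup failures on Spaces. A small fail-fast check (illustrative, not part of the project) can surface them early; the variable names below are the required ones listed above.

```python
import os

REQUIRED_VARS = [
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_DEPLOYMENT_NAME",
]


def missing_env_vars(required=REQUIRED_VARS) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not os.getenv(name)]


missing = missing_env_vars()
if missing:
    print(f"Missing required environment variables: {', '.join(missing)}")
```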
#### Deployment Steps
1. **Create Space:**
- Go to https://huggingface.co/new-space
- Select "Gradio" as SDK
- Choose "Public" or "Private"
2. **Upload Files:**
```bash
git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
cd YOUR_SPACE_NAME
cp -r chatassistant_retail/* .
git add .
git commit -m "Initial deployment"
git push
```
3. **Set Secrets:**
- Go to Space Settings → Repository secrets
- Add all required environment variables
- Space will automatically rebuild and deploy
4. **Access:**
- Your app will be live at: `https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME`
**Limitations:**
- Uses Memory session store (sessions lost on restart)
- No Redis or PostgreSQL (Spaces compute limitations)
- Limited to 16GB RAM, 8 CPU cores (Free tier)
- Automatic sleep after 48 hours of inactivity
**Best For:**
- Demos and prototypes
- Sharing with stakeholders
- Testing without infrastructure setup
- Free tier hosting
### Production Deployment
Deploy to cloud platforms for scalable, production-grade hosting.
#### Azure App Service
**Requirements:**
- Azure subscription
- Azure App Service (B1 or higher)
- Azure OpenAI, Azure Cognitive Search, Azure Database for PostgreSQL
**Steps:**
```bash
# 1. Install the Azure CLI, then sign in
az login
# 2. Create App Service
az webapp create \
  --resource-group retail-chatbot-rg \
  --plan retail-chatbot-plan \
  --name retail-chatbot-app \
  --runtime "PYTHON:3.12"
# 3. Configure environment variables
az webapp config appsettings set \
  --resource-group retail-chatbot-rg \
  --name retail-chatbot-app \
  --settings \
    DEPLOYMENT_MODE=production \
    SESSION_STORE_TYPE=postgresql \
    POSTGRES_CONNECTION_STRING="..." \
    AZURE_OPENAI_API_KEY="..."
# 4. Deploy
az webapp up \
  --resource-group retail-chatbot-rg \
  --name retail-chatbot-app
```
**Recommended Configuration:**
- **Compute:** App Service B2 or higher (3.5GB RAM)
- **Session Store:** Azure Database for PostgreSQL
- **Cache:** Azure Cache for Redis (optional)
- **Monitoring:** Azure Application Insights + LangFuse
#### Docker Deployment
**Dockerfile Example:**
```dockerfile
FROM python:3.12-slim
WORKDIR /app
# Install uv
RUN pip install uv
# Copy project files
COPY . /app
# Install dependencies
RUN uv sync
# Expose port
EXPOSE 7860
# Run the application through uv so the synced virtual environment is used
CMD ["uv", "run", "python", "-m", "chatassistant_retail"]
```
**Build and Run:**
```bash
# Build image
docker build -t retail-chatbot .
# Run container
docker run -p 7860:7860 \
  -e AZURE_OPENAI_API_KEY="..." \
  -e DEPLOYMENT_MODE=production \
  retail-chatbot
```
#### Kubernetes Deployment
For high-availability, multi-replica deployments:
**Key Considerations:**
- Use PostgreSQL or Redis for session storage (not Memory)
- Configure horizontal pod autoscaling (HPA)
- Set up ingress with TLS/SSL
- Use Azure Key Vault for secrets management
- Configure health checks and liveness probes
**Session Store:** Redis or PostgreSQL (required for multi-replica)
### Architecture Recommendations
| Deployment | Compute | Session Store | Cost | Best For |
|------------|---------|---------------|------|----------|
| **Local** | Developer machine | Memory | Free | Development |
| **HF Spaces** | Free tier (16GB) | Memory | Free | Demos, prototypes |
| **Azure App Service** | B2+ (3.5GB+) | PostgreSQL | $$ | Small-medium production |
| **Docker** | Custom | Redis/PostgreSQL | $ | Flexible hosting |
| **Kubernetes** | Multi-node cluster | Redis/PostgreSQL | $$$ | Enterprise, high-availability |
### Scaling Considerations
**Vertical Scaling (Single Instance):**
- Increase CPU/RAM allocation
- Use Memory or Redis session store
- Suitable for up to 1000 concurrent users
**Horizontal Scaling (Multiple Instances):**
- Deploy multiple replicas behind load balancer
- **Required:** Redis or PostgreSQL session store
- Configure sticky sessions (optional, for performance)
- Use Azure Front Door or Application Gateway
- Suitable for 1000+ concurrent users
**Performance Optimization:**
- Enable LangFuse for monitoring bottlenecks
- Use Redis for caching frequent queries
- Optimize Azure Search index (partition keys, replicas)
- Consider Azure OpenAI provisioned throughput for high volume
---
## Testing
The project uses **PyTest** for comprehensive testing with both unit and integration tests.
### Test Structure
```
tests/
├── unit/ # Unit tests (isolated components)
│ ├── test_observability.py # LangFuse client and metrics
│ ├── test_inventory_tools.py # Inventory tool functions
│ ├── test_session_store.py # Session persistence
│ ├── test_retriever.py # RAG retrieval logic
│ └── test_data_generator.py # Synthetic data generation
├── integration/ # Integration tests (multiple components)
│ └── test_state_manager.py # LangGraph state machine
└── test_chatassistant_retail.py # Main chatbot tests
```
### Running Tests
```bash
# Run all tests
just test
# Run with verbose output
just test -v
# Run specific test file
just test tests/unit/test_observability.py
# Run specific test function
just test tests/unit/test_observability.py::TestLangFuseClient::test_get_langfuse_client_disabled
# Run with keyword filter
just test -k "inventory"
# Test on all Python versions (3.10, 3.11, 3.12, 3.13)
just testall
```
### Running Tests with Debugger
Drop into the `ipdb` debugger on test failures:
```bash
# Drop into debugger on first failure
just pdb
# Debug specific test
just pdb tests/unit/test_inventory_tools.py
# Limit to first 10 failures
pytest --pdb --maxfail=10
```
### Coverage Reporting
```bash
# Run tests with coverage
just coverage
# View coverage report in terminal
coverage report
# Generate HTML coverage report
coverage html
# Open htmlcov/index.html in browser
```
Target coverage: **>= 90%**
### Test Fixtures
PyTest fixtures are used for common test setup:
```python
import pytest

@pytest.fixture
def mock_langfuse_client():
    """Provide a mocked LangFuse client."""
    from unittest.mock import MagicMock
    return MagicMock()

@pytest.fixture
def inventory_tool():
    """Provide an inventory tool instance."""
    from chatassistant_retail.tools import InventoryTools
    return InventoryTools()

def test_inventory_query(inventory_tool):
    result = inventory_tool.check_stock("SKU-123")
    assert result["stock_level"] > 0
```
### Async Testing
Tests for async functions use `pytest-asyncio`:
```python
import pytest

@pytest.mark.asyncio
async def test_async_llm_call():
    from chatassistant_retail.llm import AzureOpenAIClient

    client = AzureOpenAIClient()
    response = await client.chat("Test query")
    assert response is not None
```
### Mocking External Services
Use `pytest-mock` for mocking Azure services:
```python
def test_azure_search(mocker):
    # Mock Azure Cognitive Search
    mock_search = mocker.patch("azure.search.documents.SearchClient")
    mock_search.return_value.search.return_value = [
        {"sku": "SKU-123", "name": "Test Product"}
    ]

    # Test the retriever
    from chatassistant_retail.rag import Retriever

    retriever = Retriever()
    results = retriever.search("test query")
    assert len(results) == 1
```
### CI/CD Testing
GitHub Actions (`.github/workflows/test.yml`) runs the following automatically:
- Run tests on Python 3.12 and 3.13
- Check code formatting with Ruff
- Verify type hints with ty
- Generate a coverage report
View test results in GitHub Actions: https://github.com/samir72/chatassistant_retail/actions
---
## Contributing
Contributions are welcome! Please follow these guidelines:
### Getting Started
1. **Fork** the repository on GitHub
2. **Clone your fork**:
```bash
git clone https://github.com/your-username/chatassistant_retail.git
cd chatassistant_retail
```
3. **Install dependencies**:
```bash
uv sync
```
4. **Create a feature branch**:
```bash
git checkout -b feature/your-feature-name
```
### Development Process
1. **Make your changes** following the code standards below
2. **Add tests** for new functionality (maintain >= 90% coverage)
3. **Run QA checks**:
```bash
just qa # Format, lint, type-check, and test
```
4. **Update documentation** if needed (README, docstrings, CLAUDE.md)
5. **Commit your changes**:
```bash
git commit -m "Add feature: description"
```
6. **Push to your fork**:
```bash
git push origin feature/your-feature-name
```
7. **Submit a pull request** to the main repository
### Code Standards
- **Style Guide**: PEP 8 (enforced by Ruff)
- **Line Length**: 120 characters maximum
- **Type Hints**: Required for all function signatures
- **Docstrings**: Required for all public functions and classes (Google style)
- **Test Coverage**: >= 90% for all new code
- **Import Sorting**: Automatic with Ruff (isort rules)
#### Example Code Style
```python
from typing import Optional

from chatassistant_retail.observability import trace

@trace(name="example_function", trace_type="function")
async def example_function(param1: str, param2: int = 0) -> Optional[dict]:
    """Brief description of the function.

    Args:
        param1: Description of param1
        param2: Description of param2 (default: 0)

    Returns:
        Description of the return value

    Raises:
        ValueError: When param2 is negative
    """
    if param2 < 0:
        raise ValueError("param2 must be non-negative")
    return {"param1": param1, "param2": param2}
```
### Pull Request Guidelines
- **Title**: Clear, concise description (e.g., "Add purchase order export feature")
- **Description**: Explain what changed and why
- **Tests**: Include test results showing all tests pass
- **Coverage**: Show coverage hasn't decreased
- **Documentation**: Update README/docs if needed
- **Breaking Changes**: Clearly mark any breaking changes
### Reporting Bugs
Use GitHub Issues: https://github.com/samir72/chatassistant_retail/issues
Include:
- Python version
- Environment (OS, dependencies)
- Steps to reproduce
- Expected vs actual behavior
- Error messages/stack traces
- Minimal code example
---
## Credits
This package was created with [Cookiecutter](https://github.com/audreyfeldroy/cookiecutter) and the [audreyfeldroy/cookiecutter-pypackage](https://github.com/audreyfeldroy/cookiecutter-pypackage) project template.
### Technologies Used
- **LangGraph**: Agentic workflow orchestration
- **LangChain**: LLM framework and integrations
- **Azure OpenAI**: GPT-4o-mini language model
- **Azure Cognitive Search**: Vector search and RAG
- **LangFuse**: Observability and tracing
- **Gradio**: Web UI framework
- **FastMCP**: Model Context Protocol server
- **PyTest**: Testing framework
- **Ruff**: Python linter and formatter
- **uv**: Fast Python package manager
---
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
---