---
title: chatassistant_retail
emoji: 🦄
colorFrom: indigo
colorTo: pink
sdk: gradio
short_description: Web based conversational AI chatbot designed specifically...
---

# chatassistant_retail

![PyPI version](https://img.shields.io/pypi/v/chatassistant_retail.svg)
[![Documentation Status](https://readthedocs.org/projects/chatassistant_retail/badge/?version=latest)](https://chatassistant_retail.readthedocs.io/en/latest/?version=latest)

**chatassistant_retail** is a production-ready conversational AI chatbot designed specifically for the retail industry, providing intelligent assistance for retail inventory management. It features a multi-modal interface (text + images) powered by Azure OpenAI GPT-4o-mini, hybrid RAG search with Azure Cognitive Search, stateful conversation management via LangGraph, and flexible session persistence (Memory/Redis/PostgreSQL). The system includes a Gradio-based web UI, MCP tool integration, and comprehensive observability with LangFuse.

* Python Version: >= 3.10 (tested on 3.10, 3.11, 3.12, and 3.13)

---

## Table of Contents

- [Features](#features)
- [Architecture](#architecture)
- [Project Structure](#project-structure)
- [Installation](#installation)
- [Usage](#usage)
- [Development](#development)
- [Scripts Reference](#scripts-reference)
- [Session Management](#session-management)
- [Multi-Modal Processing](#multi-modal-processing)
- [Observability](#observability)
- [Deployment Options](#deployment-options)
- [Testing](#testing)
- [Contributing](#contributing)
- [Credits](#credits)

---

## Features

### Core Capabilities

- **Conversational Interface**: Gradio-based web UI for natural language interactions with retail inventory systems
- **Retail Inventory Management**: Specialized chatbot for handling inventory queries, stock levels, and purchase orders
- **Natural Language Understanding**: Powered by Azure OpenAI (GPT-4o-mini) for understanding and responding to retail-related questions
- **Agentic Workflow**: LangGraph-based state
machine for complex multi-step conversations
- **Tool Integration**: MCP (Model Context Protocol) server for inventory and purchase order tools

### Technical Features

- **Multi-Modal Input Processing**: Handle both text and images (PNG, JPG, JPEG, WebP) for product analysis and visual queries
- **Context Caching System**: Smart data reuse across conversation turns reduces redundant I/O operations and improves response times
- **Image-Based Product Lookup**: Upload product images for AI-powered identification, catalog matching, and automated inventory checking
- **Low-Stock Automation**: Automatic reorder recommendations when visually identified products are low in stock
- **LangGraph Orchestration**: Stateful conversation management with persistent session storage
- **Hybrid RAG Search**: Vector + keyword + semantic search via Azure Cognitive Search with automatic fallback to local data
- **Flexible Session Persistence**: Three backend options - Memory (fast, ephemeral), Redis (distributed), or PostgreSQL (full persistence)
- **Semantic Search**: AI-powered relevance ranking with Free tier support (1,000 queries/month)
- **Observability**: Built-in LangFuse integration for tracing, monitoring, and analytics
- **Async Processing**: Asynchronous operations for high-performance request handling
- **Graceful Fallbacks**: Automatic degradation when Azure services are unavailable (local data, keyword search)

### Retail-Specific Features

- **Inventory Queries**: Check stock levels, product availability, and warehouse information
- **Purchase Order Management**: Create, track, and manage purchase orders
- **Sample Data Generation**: Generate realistic product catalogs and sales history using Faker (500+ products, 6 months sales)
- **Product Search**: Semantic search across product catalog using Azure Cognitive Search with visual product matching

### Deployment & Development

- **Deployment Flexibility**: Local development, HuggingFace Spaces, or production deployment (Azure App
Service, Docker, K8s)
- **Development Tools**: Comprehensive test suite, data generation scripts, Azure Search setup automation
- **Multi-Environment Configuration**: Environment-based settings with validation and graceful fallbacks

---

## Architecture

The chatbot follows an agentic architecture pattern using LangGraph for state management and orchestration:

```
┌─────────────────────────────────────────────────────────┐
│                  Gradio Web Interface                   │
│  ┌─────────────┐    ┌───────────────┐                   │
│  │   Chat UI   │    │    Session    │                   │
│  │             │    │  Management   │                   │
│  └─────────────┘    └───────────────┘                   │
└────────────────────────┬────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────┐
│                 LangGraph State Manager                 │
│  ┌──────────────────┐    ┌──────────────────────────┐   │
│  │   State Graph    │    │      Session Store       │   │
│  │   (Workflow)     │    │   (Memory/PostgreSQL)    │   │
│  └──────────────────┘    └──────────────────────────┘   │
└────────────────────────┬────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────┐
│                  Business Logic Layer                   │
│  ┌──────────────┐  ┌─────────────┐  ┌──────────────┐    │
│  │    Azure     │  │  Inventory  │  │   Purchase   │    │
│  │    OpenAI    │  │    Tools    │  │ Order Tools  │    │
│  └──────────────┘  └─────────────┘  └──────────────┘    │
└────────────────────────┬────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────┐
│                 Data/Integration Layer                  │
│  ┌──────────────┐  ┌─────────────┐  ┌──────────────┐    │
│  │  PostgreSQL  │  │    Azure    │  │    Redis     │    │
│  │  (Sessions)  │  │   Search    │  │   (Cache)    │    │
│  └──────────────┘  └─────────────┘  └──────────────┘    │
└────────────────────────┬────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────┐
│                   Observability Layer                   │
│  ┌──────────────┐  ┌─────────────┐  ┌──────────────┐    │
│  │   LangFuse   │  │   Metrics   │  │    Python    │    │
│  │   Tracing    │  │  Collector  │  │   Logging    │    │
│  └──────────────┘  └─────────────┘  └──────────────┘    │
└─────────────────────────────────────────────────────────┘
```

### Key Components

- **Gradio UI** (`ui/`): Web-based chat interface with multi-modal input (text + images)
- **LangGraph State Manager** (`state/`): Conversation state management with Memory/Redis/PostgreSQL session stores
- **Workflow Orchestration** (`workflow/`): Image-based product lookup with multi-step automation (vision → search → inventory → recommendations)
- **Context Utilities** (`tools/context_utils.py`): Smart caching layer for performance optimization and data coherence
- **Azure OpenAI Client** (`llm/`): Multi-modal LLM integration (GPT-4o-mini) with prompt templates and response parsing
- **Context-Aware Tools** (`tools/`): Inventory and purchase order tools with optional state parameter for intelligent cache reuse
- **RAG System** (`rag/`): Hybrid search with Azure Cognitive Search (vector + keyword + semantic) and local fallback
- **Observability** (`observability/`): LangFuse tracing and metrics collection across all components
- **Data Models** (`data/`): Pydantic models for products, sales, and purchase orders with sample data generation

### Design Principles

- **Separation of Concerns**: Clear separation between UI, orchestration, business logic, and data layers
- **Stateful Conversations**: LangGraph manages conversation state with checkpointing
- **Tool-Based Architecture**: LLM invokes tools (inventory queries, purchase orders) through structured outputs
- **Observable by Default**: All LLM calls and tool invocations traced with LangFuse
- **Error Resilience**: Graceful degradation and comprehensive error handling

---

## Project Structure

This project uses the **src-layout** pattern for better development and testing practices:

```
chatassistant_retail/
├── app.py  # HuggingFace Spaces entry point
│
├── src/
│   └── chatassistant_retail/
│       ├── __init__.py  # Package initialization
│       ├── __main__.py  # Application entry point
│       ├── cli.py  # CLI entry point
│       ├── chatbot.py  # Main chatbot orchestrator (multi-modal)
│       │
│       ├── ui/  # Gradio web interface
│       │   ├── __init__.py
│       │   ├── gradio_app.py  # Main Gradio application
│       │   ├── chat_interface.py  # Chat UI components
│       │   └── metrics_dashboard.py  # Observability dashboard (UI currently disabled)
│       │
│       ├── state/  # LangGraph state management
│       │   ├── __init__.py
│       │   ├── langgraph_manager.py  # State graph orchestration
│       │   ├── session_store.py  # Abstract session interface
│       │   ├── memory_store.py  # In-memory store (HF Spaces)
│       │   ├── redis_store.py  # Redis store (distributed)
│       │   └── postgresql_store.py  # PostgreSQL store (persistent)
│       │
│       ├── llm/  # LLM integration
│       │   ├── __init__.py
│       │   ├── azure_openai_client.py  # Azure OpenAI client (GPT-4o-mini)
│       │   ├── prompt_templates.py  # System/user prompts
│       │   └── response_parser.py  # Response parsing
│       │
│       ├── workflow/  # NEW: Workflow orchestration
│       │   ├── __init__.py
│       │   └── image_processor.py  # Image-based product lookup
│       │
│       ├── tools/  # Inventory & PO tools
│       │   ├── __init__.py
│       │   ├── context_utils.py  # NEW: Context caching utilities
│       │   ├── inventory_tools.py  # UPDATED: Context-aware inventory operations
│       │   ├── purchase_order_tools.py  # UPDATED: Context-aware PO operations
│       │   └── mcp_server.py  # UPDATED: MCP server with state passing
│       │
│       ├── rag/  # Azure Cognitive Search RAG
│       │   ├── __init__.py
│       │   ├── azure_search_client.py  # Hybrid search client (vector+keyword+semantic)
│       │   ├── retriever.py  # Document retrieval with fallback
│       │   └── embeddings.py  # Embedding generation
│       │
│       ├── data/  # Data models and generation
│       │   ├── __init__.py
│       │   ├── models.py  # Product, Sale, PurchaseOrder models
│       │   └── generator.py  # Sample data generator (Faker)
│       │
│       ├── observability/  # LangFuse observability
│       │   ├── __init__.py
│       │   ├── langfuse_client.py  # LangFuse wrapper
│       │   ├── decorators.py  # @trace decorator
│       │   └── metrics_collector.py  # Metrics aggregation
│       │
│       └── config/  # Configuration
│           ├── __init__.py
│           ├── settings.py  # Pydantic settings (env-based)
│           └── deployment.py  # Deployment configs
│
├── data/  # Sample data files
│   ├── products.json  # 500+ sample products (216KB)
│   ├── sales_history.json  # 6 months sales data (3.5MB)
│   └── purchase_orders.json  # Sample purchase orders
│
├── scripts/  # Utility scripts
│   ├── setup_azure_search.py  # Azure Search index setup
│   ├── generate_sample_data.py  # Generate sample product/sales data
│   ├── test_gradio_ui.py  # UI testing script
│   ├── test_phase2.py  # Integration testing
│   └── test_phase3.py  # E2E scenario testing
│
├── tests/
│   ├── __init__.py
│   ├── unit/  # Unit tests
│   │   ├── test_context_utils.py  # NEW: Context caching tests
│   │   ├── test_image_processor.py  # NEW: Image workflow tests
│   │   ├── test_observability.py
│   │   ├── test_inventory_tools.py
│   │   ├── test_session_store.py
│   │   ├── test_retriever.py
│   │   ├── test_azure_search_client.py
│   │   ├── test_azure_openai_client.py
│   │   ├── test_mcp_server.py
│   │   └── test_data_generator.py
│   ├── integration/  # Integration tests
│   │   ├── test_tool_context_integration.py  # NEW: Context-aware tool tests
│   │   └── test_state_manager.py
│   └── test_chatassistant_retail.py  # Main tests
│
├── docs/  # Sphinx documentation
│   ├── conf.py
│   ├── index.rst
│   └── usage.rst
│
├── .github/
│   └── workflows/
│       └── test.yml  # CI/CD pipeline
│
├── pyproject.toml  # Project metadata and dependencies
├── justfile  # Task automation
├── CLAUDE.md  # Claude Code guidance
├── README.md  # This file
├── HISTORY.md  # Changelog
└── LICENSE  # MIT License
```

### Key Directories

- **app.py**: HuggingFace Spaces entry point (sets deployment mode and launches Gradio on 0.0.0.0:7860)
- **ui/**: Gradio-based web interface with multi-modal chat
- **state/**: LangGraph state machine with three session backends (Memory/Redis/PostgreSQL)
- **llm/**: Azure OpenAI integration (GPT-4o-mini) with prompt engineering and multi-modal support
- **tools/**: Inventory and purchase order tools with MCP server integration
- **rag/**: Hybrid search with Azure Cognitive Search (vector+keyword+semantic) and local fallback
- **data/**: Pydantic data
models and Faker-based sample data generation
- **observability/**: LangFuse tracing, metrics collection, and monitoring
- **config/**: Pydantic settings with environment variable support and validation
- **scripts/**: Setup scripts (Azure Search index, data generation, testing)
- **data/ (root)**: Sample JSON files (products, sales history, purchase orders)
- **tests/**: Comprehensive PyTest suite (unit and integration tests)

---

## Installation

### Prerequisites

- Python >= 3.10
- [uv](https://github.com/astral-sh/uv) (Rust-based Python package manager) - **Required**
- [just](https://github.com/casey/just) (Command runner for task automation) - **Required for development**
- Azure OpenAI API access
- Azure Cognitive Search instance (optional, for RAG)
- PostgreSQL database (optional, for persistent sessions)
- Redis instance (optional, for caching)
- LangFuse account (optional, for observability)

#### Installing uv and just

Both `uv` and `just` need to be installed system-wide (not in a virtual environment):

**macOS:**

```bash
# Install with Homebrew (recommended)
brew install uv just

# Or install uv via curl
curl -LsSf https://astral.sh/uv/install.sh | sh

# And install just separately
brew install just
```

**Linux:**

```bash
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install just
curl --proto '=https' --tlsv1.2 -sSf https://just.systems/install.sh | bash
```

**Windows:**

```powershell
# Install uv
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

# Install just with cargo
cargo install just
```

After installation, verify both are available:

```bash
uv --version    # Should show: uv 0.9.x or later
just --version  # Should show: just 1.x.x or later
```

### Install from PyPI (when published)

```bash
pip install chatassistant_retail
```

### Development Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/samir72/chatassistant_retail.git
   cd chatassistant_retail
   ```

2. Install dependencies with uv:

   ```bash
   uv sync
   ```

3. Set up environment variables (create `.env` file):

   ```bash
   # Azure OpenAI (Required)
   AZURE_OPENAI_API_KEY=your-api-key
   AZURE_OPENAI_ENDPOINT=https://your-endpoint.openai.azure.com/
   AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o-mini  # GPT-4o-mini deployment
   AZURE_OPENAI_EMBEDDING_DEPLOYMENT=text-embedding-ada-002  # For RAG embeddings
   AZURE_OPENAI_API_VERSION=2024-02-15-preview

   # Azure Cognitive Search (Optional - fallback to local data if not configured)
   AZURE_COGNITIVE_SEARCH_ENDPOINT=https://your-search.search.windows.net
   AZURE_COGNITIVE_SEARCH_API_KEY=your-search-key
   AZURE_SEARCH_INDEX_NAME=products
   # Note: Semantic search must be enabled in Azure Portal (Search Service → Semantic ranker → Free)

   # Session Persistence (Optional - defaults to Memory store)
   SESSION_STORE_TYPE=memory  # Options: memory, redis, postgresql
   REDIS_URL=redis://localhost:6379/0  # If using redis
   POSTGRES_CONNECTION_STRING=postgresql://user:password@localhost:5432/chatbot  # If using postgresql

   # Deployment Configuration (Optional)
   DEPLOYMENT_MODE=local  # Options: local, hf_spaces
   LOG_LEVEL=INFO  # DEBUG, INFO, WARNING, ERROR

   # LangFuse Observability (Optional - recommended for production)
   LANGFUSE_PUBLIC_KEY=pk-lf-...
   LANGFUSE_SECRET_KEY=sk-lf-...
   LANGFUSE_HOST=https://cloud.langfuse.com
   LANGFUSE_ENABLED=true
   ```

4. Verify installation:

   ```bash
   just test
   ```

---

## Sample Data

The repository includes **sample data files** for immediate testing and development:

| File | Size | Description |
|------|------|-------------|
| `data/products.json` | 220 KB | 500+ sample products across 8 retail categories (Electronics, Clothing, Groceries, etc.) |
| `data/sales_history.json` | 3.6 MB | 6 months of sales transaction history with seasonal patterns |
| `data/purchase_orders.json` | 1.3 KB | Sample purchase orders (pending/fulfilled statuses) |

**Total size:** ~3.7 MB

### Why Sample Data is Included

Previously, the `data/` directory was excluded from version control to keep the repository lean. As of version 0.1.1, sample data is **included by default** for:

- ✅ Faster setup for new developers
- ✅ Consistent test data across environments
- ✅ No need to run data generation scripts initially

### Regenerating Sample Data (Optional)

If you want to regenerate or customize sample data:

```bash
python scripts/generate_sample_data.py
```

This will create fresh sample data with configurable parameters.

---

## Usage

### Setting Up Azure Cognitive Search

If you're using Azure Cognitive Search for RAG (Retrieval-Augmented Generation), you need to create the search index before running the chatbot.

#### Prerequisites

- Azure Cognitive Search service created
- Environment variables configured:
  - `AZURE_COGNITIVE_SEARCH_ENDPOINT`
  - `AZURE_COGNITIVE_SEARCH_API_KEY`
  - `AZURE_SEARCH_INDEX_NAME` (optional, defaults to "products")

#### Create the Index

Run the setup script:

```bash
python scripts/setup_azure_search.py
```

This will:

1. Create the "products" index with the proper schema
2. Configure vector search (1536-dimensional embeddings)
3. Set up semantic search capabilities
4. Optionally load 500 sample products with embeddings

#### Verify Setup

Check index health:

```python
from chatassistant_retail.rag import AzureSearchClient
from chatassistant_retail.config import get_settings

client = AzureSearchClient(get_settings())
health = client.check_index_health()
print(f"Status: {health['overall_status']}")
print(f"Document count: {health['stats']['document_count']}")
```

**Note:** If the index doesn't exist, the chatbot will still work but will fall back to local product data without vector search capabilities.
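
To make the fallback behavior in the note above concrete, here is a minimal sketch of how a remote search could degrade to plain keyword matching over the local `data/products.json` file. It is not the project's `retriever.py`: the function names `keyword_search` and `search_with_fallback`, the `remote_search` callable, and the `name` and `description` product fields are all assumptions made for illustration.

```python
import json
from pathlib import Path


def keyword_search(products, query, limit=5):
    """Rank products by how many query terms occur in their name or description.

    The 'name' and 'description' keys are assumed, not taken from the real schema.
    """
    terms = [term for term in query.lower().split() if term]
    scored = []
    for product in products:
        text = f"{product.get('name', '')} {product.get('description', '')}".lower()
        score = sum(1 for term in terms if term in text)
        if score:
            scored.append((score, product))
    # Highest score first; ties keep their original catalog order (stable sort).
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [product for _, product in scored[:limit]]


def search_with_fallback(query, remote_search=None, products_path="data/products.json"):
    """Use the remote search when it works; otherwise search the local JSON file."""
    if remote_search is not None:
        try:
            return remote_search(query)
        except Exception:
            pass  # Remote index missing or unreachable: fall through to local data.
    products = json.loads(Path(products_path).read_text(encoding="utf-8"))
    return keyword_search(products, query)
```

Passing the remote search in as a callable keeps the sketch independent of any Azure SDK; in the real project that role would presumably be played by the search client.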

---

### Running the Chatbot

#### Start Gradio Web Interface

```bash
# Using Python module
python -m chatassistant_retail

# Or using the CLI entry point
chatassistant-retail

# With custom port
chatassistant-retail --port 7860
```

The web interface will be available at `http://localhost:7860` with:

- **Chat Interface**: Main conversational UI with multi-modal support (text + images)

#### Command-Line Usage

```bash
# Interactive CLI mode
chatassistant-retail --cli

# Single query
chatassistant-retail --query "What is the stock level for SKU-12345?"
```

### Example Conversations

#### Inventory Query

```
User: What is the current stock level for SKU-12345?

A: Let me check the inventory for SKU-12345...
[Tool: inventory_tools.check_stock_level(sku="SKU-12345")]
The current stock level for SKU-12345 is 150 units across 3 warehouses.
```

#### Purchase Order Management

```
User: Create a purchase order for 500 units of SKU-67890 from Supplier ABC

A: I'll create a purchase order for you...
[Tool: purchase_order_tools.create_po(sku="SKU-67890", quantity=500, supplier="ABC")]
Purchase order PO-2025-001 created successfully for 500 units of SKU-67890.
Expected delivery: 2025-01-15
```

#### Product Search with RAG

```
User: Tell me about our winter jacket inventory

A: Let me search our product catalog...
[RAG: Azure Cognitive Search retrieval with semantic search]
We have 5 winter jacket styles in stock with a total of 1,250 units.
Most popular: "Alpine Puffer Jacket" (SKU-WJ-001) with 450 units available.
```

#### Multi-Modal Query (Image + Text)

```
User: [Uploads product image] Is this the same as SKU-12345? What's the stock level?

A: Analyzing the product image...
[Multi-Modal: GPT-4o-mini processing]
[Tool: inventory_tools.check_stock_level(sku="SKU-12345")]
Yes, this appears to be the same product as SKU-12345 (Blue Athletic Sneaker).
Current stock: 85 units available across 2 warehouses.
```

#### Image-Based Product Workflow (NEW)

```
User: [Uploads product image] "Check inventory for this item"

A: Let me analyze the product image...

[ImageProductProcessor workflow]
- Vision Extraction: Identifies "Wireless Optical Mouse" (Electronics, confidence: 0.89)
- Catalog Search: Finds 2 matching products via RAG
- Inventory Check:
  * SKU-10001: 50 units (adequate stock)
  * SKU-10002: 8 units (below reorder level of 20)
- Reorder Recommendation: Order 50 units of SKU-10002 (HIGH urgency, 5 days until stockout)

Response:
"🔍 Product Identification Results

I identified: Wireless Optical Mouse
Category: Electronics

📦 Matching Products in Inventory:

1. Wireless Optical Mouse (SKU-10001)
   - Price: $24.99
   - Current Stock: 50 units
   - Reorder Level: 10 units
   - Status: OK
   - Supplier: TechSupplies Inc

2. Ergonomic Wireless Mouse (SKU-10002)
   - Price: $34.99
   - Current Stock: 8 units
   - Reorder Level: 20 units
   - Status: LOW STOCK
   - Supplier: ComfortTech

💡 Recommendations:

⚠️ Ergonomic Wireless Mouse (SKU-10002) is running low:
   - Days until stockout: 5
   - Suggested order quantity: 50 units
   - Urgency: HIGH

Would you like me to create a purchase order?"
```

---

## Development

This project uses `just` for task automation. All development commands are defined in the `justfile`.

### Quick Start

```bash
# List all available commands
just list

# Run full QA pipeline (format, lint, type-check, test)
just qa

# Run tests
just test

# Run tests with debugger on failure
just pdb

# Generate coverage report
just coverage
```

### Development Workflow

1. **Make changes** to the codebase
2. **Run QA**: `just qa` (formats, lints, type-checks, and tests)
3. **Debug failures**: `just pdb` if tests fail
4. **Check coverage**: `just coverage` to ensure adequate test coverage
5. **Build package**: `just build` when ready to release

### Code Quality Tools

- **Ruff**: Fast Python linter and formatter (line length: 120)
- **isort**: Import sorting (integrated with Ruff)
- **ty**: Type checking with all rules enabled
- **pytest**: Testing framework with async support and coverage reporting

### Local Development Setup

```bash
# Activate virtual environment (if not using uv)
source .venv/bin/activate

# Install pre-commit hooks (optional)
pre-commit install

# Run development server with auto-reload
python -m chatassistant_retail --reload

# Run tests in watch mode
pytest-watch
```

### Data Generation

The project includes tools for generating realistic sample data for development and testing.

#### Generating Sample Data

Use the `generate_sample_data.py` script to create sample products and sales history:

```bash
# Generate default data (500 products, 6 months sales history)
python scripts/generate_sample_data.py

# Custom data generation
python scripts/generate_sample_data.py --count 1000 --months 12
```

#### What Gets Generated

**Products (`data/products.json`):**

- 500+ realistic retail products (configurable)
- Categories: Electronics, Clothing, Home & Garden, Sports, Books
- Realistic pricing, descriptions, and metadata
- Auto-generated embeddings for Azure Search (1536 dimensions)

**Sales History (`data/sales_history.json`):**

- 6 months of transactional data (configurable)
- Realistic sales patterns (seasonality, trends)
- Multiple warehouses and channels
- Customer demographics

**Purchase Orders (`data/purchase_orders.json`):**

- Sample PO data for testing
- Various suppliers and statuses
- Delivery tracking information

#### Integration with Azure Search

The generated data includes pre-computed embeddings for immediate upload to Azure Search:

```bash
# 1. Generate sample data with embeddings
python scripts/generate_sample_data.py

# 2. Upload to Azure Search
python scripts/setup_azure_search.py --load-data
```

#### Customization

The data generator uses [Faker](https://faker.readthedocs.io/) for realistic data generation. Customize by modifying `src/chatassistant_retail/data/generator.py`.

---

## Scripts Reference

The `scripts/` directory contains utility scripts for setup, testing, and data management.

### setup_azure_search.py

**Purpose:** Create and configure Azure Cognitive Search index with proper schema for RAG.

**Usage:**

```bash
# Create index (prompts to load sample data)
python scripts/setup_azure_search.py

# Create index and auto-load data
python scripts/setup_azure_search.py --load-data

# Recreate existing index
python scripts/setup_azure_search.py --recreate
```

**What It Does:**

- Creates "products" index with hybrid search configuration
- Configures vector search (HNSW algorithm, 1536 dimensions)
- Sets up semantic search capabilities
- Optionally loads 500 sample products with embeddings
- Uploads in batches of 100 for efficiency
- Performs health check and displays statistics

**Requirements:**

- `AZURE_COGNITIVE_SEARCH_ENDPOINT` environment variable
- `AZURE_COGNITIVE_SEARCH_API_KEY` environment variable
- `AZURE_OPENAI_EMBEDDING_DEPLOYMENT` for embedding generation

### generate_sample_data.py

**Purpose:** Generate realistic retail data for development and testing.

**Usage:**

```bash
# Default: 500 products, 6 months sales
python scripts/generate_sample_data.py

# Custom counts
python scripts/generate_sample_data.py --count 1000 --months 12

# Dry run (don't save files)
python scripts/generate_sample_data.py --dry-run
```

**Output:**

- `data/products.json` - Product catalog with embeddings
- `data/sales_history.json` - Sales transactions
- `data/purchase_orders.json` - PO data

### test_gradio_ui.py

**Purpose:** Interactive testing of Gradio UI components.

**Usage:**

```bash
python scripts/test_gradio_ui.py
```

Launches the Gradio interface for manual testing and validation.

### test_phase2.py / test_phase3.py

**Purpose:** Integration and end-to-end testing scripts.

**Usage:**

```bash
# Integration testing
python scripts/test_phase2.py

# E2E scenario testing
python scripts/test_phase3.py
```

Tests the complete chatbot workflow including LangGraph state management, tool execution, and RAG retrieval.

---

## Session Management

The chatbot supports three different session storage backends for conversation state persistence, allowing you to choose the right balance between simplicity, performance, and durability.

### Session Store Backends

#### Memory Store (Default)

The in-memory session store is the default and simplest option, ideal for development and HuggingFace Spaces deployment.

**Characteristics:**

- ✅ **Fast**: No network latency, instant access
- ✅ **Simple**: No external dependencies required
- ✅ **Auto-configured**: Default for HF Spaces deployment
- ❌ **Ephemeral**: Sessions lost on restart
- ❌ **Single-instance**: Not shared across multiple app instances

**Configuration:**

```bash
SESSION_STORE_TYPE=memory  # or omit (default)
```

**Use When:**

- Developing locally
- Deploying to HuggingFace Spaces
- Session persistence not critical
- Running single instance

#### Redis Store

Redis provides distributed session storage with persistence, ideal for production deployments with multiple instances.

**Characteristics:**

- ✅ **Fast**: In-memory with disk persistence
- ✅ **Distributed**: Shared across multiple app instances
- ✅ **Persistent**: Survives app restarts (with RDB/AOF)
- ✅ **TTL Support**: Automatic session expiration
- ⚠️ **Requires Redis**: External service dependency

**Configuration:**

```bash
SESSION_STORE_TYPE=redis
REDIS_URL=redis://localhost:6379/0

# Or for production with auth:
REDIS_URL=redis://:password@redis-host:6379/0
```

**Use When:**

- Running multiple app instances (load balanced)
- Need distributed session sharing
- Want automatic session expiration
- Production deployment with high availability

#### PostgreSQL Store

PostgreSQL provides full persistence with queryable session history, ideal for audit requirements and analytics.

**Characteristics:**

- ✅ **Fully Persistent**: Durable storage with ACID guarantees
- ✅ **Queryable**: SQL access to session data and history
- ✅ **Audit Trail**: Complete conversation history
- ✅ **Backup/Recovery**: Standard database backup tools
- ⚠️ **Slower**: Disk I/O overhead vs in-memory stores
- ⚠️ **Requires PostgreSQL**: External database dependency

**Configuration:**

```bash
SESSION_STORE_TYPE=postgresql
POSTGRES_CONNECTION_STRING=postgresql://user:password@localhost:5432/chatbot
```

**Use When:**

- Need complete audit trail
- Compliance/regulatory requirements
- Want to query conversation history
- Long-term session retention needed
- Analytics on conversation patterns

### Configuration Examples

**Local Development (Memory):**

```bash
# .env
SESSION_STORE_TYPE=memory  # Fast, simple, ephemeral
```

**HuggingFace Spaces (Memory):**

```bash
# Automatically configured via app.py
DEPLOYMENT_MODE=hf_spaces
SESSION_STORE_TYPE=memory  # Default for HF Spaces
```

**Production (Redis):**

```bash
# .env
SESSION_STORE_TYPE=redis
REDIS_URL=redis://:your-password@redis.example.com:6379/0
```

**Enterprise (PostgreSQL):**

```bash
# .env
SESSION_STORE_TYPE=postgresql
POSTGRES_CONNECTION_STRING=postgresql://chatbot:password@db.example.com:5432/chatbot
```

### Choosing a Backend

| Criteria | Memory | Redis | PostgreSQL |
|----------|--------|-------|------------|
| **Speed** | ⭐⭐⭐ Fastest | ⭐⭐ Very Fast | ⭐ Fast |
| **Persistence** | ❌ None | ⭐⭐ Configurable | ⭐⭐⭐ Full |
| **Multi-Instance** | ❌ No | ✅ Yes | ✅ Yes |
| **Setup Complexity** | ⭐⭐⭐ None | ⭐⭐ Moderate | ⭐ Complex |
| **Cost** | Free | $ Low | $$ Moderate |
| **Best For** | Dev, HF Spaces | Production | Enterprise, Audit |

---

## Multi-Modal Processing

The chatbot supports multi-modal input, allowing users to send both text and images for visual product analysis, comparison, and identification.

### Overview

Powered by Azure OpenAI GPT-4o-mini, the chatbot can:

- Analyze product images to identify items
- Compare products visually against catalog images
- Extract product details from photos (color, style, features)
- Verify product authenticity and condition
- Assist with visual inventory checks

### Supported Formats

**Image Formats:**

- PNG (.png)
- JPEG (.jpg, .jpeg)
- WebP (.webp)

**Size Limits:**

- Maximum file size: 20MB (Azure OpenAI limit)
- Recommended resolution: 2048x2048 pixels or less
- Images automatically resized if too large

### Usage Examples

#### Text-Only Query

```
# Via Gradio UI: Type in chat box
User: "What is the stock level for SKU-12345?"
```

#### Image + Text Query

```
# Via Gradio UI: Click image upload button, select image, then type query
User: [Uploads product photo] "Is this product in our catalog? Check inventory?"
```

The assistant analyzes the image using GPT-4o-mini, compares it against the catalog, and provides inventory information.

#### Product Image Analysis

```
User: [Uploads warehouse photo showing multiple items]
      "Identify all products in this image and check stock levels"
```

The chatbot can identify multiple products in a single image and provide bulk inventory information.
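
The format and size limits listed under Supported Formats can be checked before an image is sent to the model. The sketch below is illustrative only and does not come from the project's code: `validate_image` and `fit_within` are hypothetical helpers, and reading "20MB" as 20 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}
MAX_IMAGE_BYTES = 20 * 1024 * 1024  # Assumed reading of the documented 20MB limit
MAX_SIDE_PIXELS = 2048  # Recommended maximum resolution per side


def validate_image(path, size_bytes=None):
    """Return (ok, reason) for an upload, based on extension and file size."""
    file_path = Path(path)
    extension = file_path.suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        return False, f"unsupported format: {extension or 'no extension'}"
    if size_bytes is None:
        size_bytes = file_path.stat().st_size  # Only touch the disk when no size is given
    if size_bytes > MAX_IMAGE_BYTES:
        return False, "file exceeds 20MB limit"
    return True, "ok"


def fit_within(width, height, max_side=MAX_SIDE_PIXELS):
    """Scale dimensions down (never up) so the longer side is at most max_side."""
    scale = min(1.0, max_side / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, `fit_within(4096, 2048)` yields `(2048, 1024)`, preserving the aspect ratio, while an image already within the limit is returned unchanged.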

### Best Practices

**Image Quality:**

- Use clear, well-lit photos
- Ensure products are centered and in focus
- Avoid excessive image compression

**Query Construction:**

- Combine images with specific questions for best results
- Reference SKUs or product names when known
- Ask focused questions (inventory, identification, comparison)

**Supported Use Cases:**

- ✅ Product identification from photos
- ✅ Visual comparison against catalog
- ✅ Quality/authenticity verification
- ✅ Bulk identification from warehouse photos
- ❌ Image generation or editing (not supported)

---

## Observability

The chatbot includes comprehensive observability using **LangFuse** for distributed tracing and monitoring.

### LangFuse Integration

LangFuse is integrated throughout the application for automatic tracing of:

- **LLM Calls**: All Azure OpenAI requests with prompts, completions, and token usage
- **Tool Invocations**: Inventory queries and purchase order operations
- **State Transitions**: LangGraph state changes and workflow steps
- **RAG Operations**: Document retrieval and embedding generation

### Configuration

Set up LangFuse in your `.env` file:

```bash
LANGFUSE_PUBLIC_KEY=pk-lf-your-public-key
LANGFUSE_SECRET_KEY=sk-lf-your-secret-key
LANGFUSE_HOST=https://cloud.langfuse.com  # or self-hosted
LANGFUSE_ENABLED=true
```

### Using the @trace Decorator

Automatically trace any function:

```python
from chatassistant_retail.observability import trace

@trace(name="inventory_check", trace_type="tool")
async def check_inventory(sku: str):
    # Function automatically traced in LangFuse
    return await inventory_service.get_stock(sku)
```

### Manual Tracing (Advanced Usage)

**Note:** For most use cases, the `@trace` decorator is recommended. Use manual tracing only when you need fine-grained control over span lifecycle.
```python
from chatassistant_retail.observability import create_span, log_event

# Create a span (must call .end() when done)
span = create_span(
    name="complex_workflow",
    input_data={"query": "user input"},
    metadata={"user_id": "123"},
)

try:
    # Do work (perform_operation is a placeholder for your own logic)
    result = perform_operation()

    # Log events within the span
    log_event(
        name="operation_milestone",
        level="INFO",
        input_data={"checkpoint": "halfway"},
    )

    # Update span with output
    span.update(output={"result": result})
finally:
    # Always end the span
    span.end()
```

**Important:** Spans created with `create_span()` must be explicitly ended with `.end()` to avoid memory leaks. Use the `@trace` decorator for automatic lifecycle management.

### Metrics Dashboard

**Status:** The Gradio UI metrics dashboard is currently disabled.

**Access Metrics:**

- **LangFuse Web Dashboard:** https://cloud.langfuse.com (recommended for production monitoring)
- **Programmatic Access:** Use the `MetricsCollector` class directly (see Metrics Collection section below)

All observability infrastructure remains active. The `MetricsCollector` class continues to aggregate data from LangFuse traces for programmatic access.
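Returning to the manual-tracing requirement that every span be ended: one way to make that hard to forget is to wrap the `try`/`finally` pattern in a context manager. This is a minimal sketch, not part of the package. `managed_span` is a hypothetical helper, and `FakeSpan` is a stand-in with the same `update()`/`end()` shape as the spans shown above so the example runs without LangFuse; in real code the factory argument would presumably be `create_span`.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator


@contextmanager
def managed_span(span_factory: Callable[..., Any], **span_kwargs: Any) -> Iterator[Any]:
    """Create a span and guarantee that .end() runs, even if the body raises."""
    span = span_factory(**span_kwargs)
    try:
        yield span
    finally:
        span.end()


class FakeSpan:
    """Hypothetical stand-in for a real span object (assumption, for demonstration only)."""

    def __init__(self, name: str, **_: Any) -> None:
        self.name = name
        self.output: Any = None
        self.ended = False

    def update(self, output: Any = None) -> None:
        self.output = output

    def end(self) -> None:
        self.ended = True


# The span is ended automatically when the block exits.
with managed_span(FakeSpan, name="complex_workflow") as span:
    span.update(output={"result": "ok"})
```

The design choice here is to accept the factory as a parameter, which keeps the helper testable without a LangFuse connection.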
### Metrics Collection

The `MetricsCollector` class aggregates data from LangFuse traces:

```python
from chatassistant_retail.observability import MetricsCollector

collector = MetricsCollector()
metrics = collector.get_dashboard_data()

print(f"Total queries: {metrics['total_queries']}")
print(f"Avg response time: {metrics['avg_response_time']:.2f}s")
print(f"Success rate: {metrics['success_rate']:.1f}%")
```

### Logging

Structured logging with Python's `logging` module:

```python
import logging

logger = logging.getLogger(__name__)

logger.info("Processing user query", extra={
    "session_id": session_id,
    "query_length": len(query)
})
```

Log levels are configurable via environment variables:

```bash
LOG_LEVEL=INFO  # DEBUG, INFO, WARNING, ERROR, CRITICAL
```

### Monitoring in Production

LangFuse provides:

- **Request Tracing**: End-to-end visibility of each conversation
- **Performance Metrics**: Latency, throughput, and error rates
- **Cost Tracking**: Token usage and API costs per request
- **User Analytics**: Session duration, query patterns, tool usage
- **Error Analysis**: Exception tracking and debugging

Access your LangFuse dashboard at https://cloud.langfuse.com to view:

- Real-time trace explorer
- Analytics dashboards
- Cost reports
- User session replays

---

## Deployment Options

The chatbot supports multiple deployment scenarios, from local development to production hosting on cloud platforms.
### Local Development

**Quick Start:**

```bash
# Clone repository
git clone https://github.com/samir72/chatassistant_retail.git
cd chatassistant_retail

# Install dependencies
uv sync

# Set up environment variables
cp .env.example .env
# Edit .env with your Azure credentials

# Run locally
python -m chatassistant_retail
```

**Features:**

- Hot reload with `--reload` flag
- Full debugging capabilities
- All features enabled (Azure Search, Redis, PostgreSQL, LangFuse)
- Access at `http://localhost:7860`

**Session Storage:** Any (Memory, Redis, PostgreSQL)

### HuggingFace Spaces

Deploy directly to HuggingFace Spaces for free hosting with automatic HTTPS and sharing.

#### Prerequisites

1. HuggingFace account (https://huggingface.co)
2. Azure OpenAI API credentials
3. (Optional) Azure Cognitive Search for RAG

#### Configuration

The `app.py` file is pre-configured for HF Spaces deployment:

```python
import os

# app.py sets deployment mode automatically
os.environ["DEPLOYMENT_MODE"] = "hf_spaces"
```

**Important: src-layout Workaround**

Due to HuggingFace Spaces' Docker build process, this project uses a **sys.path manipulation workaround** instead of standard package installation:

- `requirements.txt` installs only dependencies (no package self-installation via `.`)
- `app.py` adds the `src/` directory to the Python path at startup
- Imports work without formal package installation

This is intentional and necessary because HF Spaces' auto-generated Dockerfile mounts `requirements.txt` before copying `pyproject.toml`, preventing standard `pip install .` from working. For local development, continue using `pip install -e .` as normal.

**Environment Variables (HF Spaces Secrets):**

```bash
# Required
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=...
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o-mini

# Recommended
AZURE_OPENAI_EMBEDDING_DEPLOYMENT=text-embedding-ada-002
AZURE_COGNITIVE_SEARCH_ENDPOINT=...
AZURE_COGNITIVE_SEARCH_API_KEY=...

# Optional
LANGFUSE_PUBLIC_KEY=...
LANGFUSE_SECRET_KEY=...
```

#### Deployment Steps

1. **Create Space:**
   - Go to https://huggingface.co/new-space
   - Select "Gradio" as SDK
   - Choose "Public" or "Private"

2. **Upload Files:**

   ```bash
   git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
   cd YOUR_SPACE_NAME
   # Assumes the project checkout sits next to the Space clone
   cp -r ../chatassistant_retail/* .
   git add .
   git commit -m "Initial deployment"
   git push
   ```

3. **Set Secrets:**
   - Go to Space Settings → Repository secrets
   - Add all required environment variables
   - Space will automatically rebuild and deploy

4. **Access:**
   - Your app will be live at: `https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME`

**Limitations:**

- Uses Memory session store (sessions lost on restart)
- No Redis or PostgreSQL (Spaces compute limitations)
- Limited to 16GB RAM, 2 vCPUs (Free tier)
- Automatic sleep after 48 hours of inactivity

**Best For:**

- Demos and prototypes
- Sharing with stakeholders
- Testing without infrastructure setup
- Free tier hosting

### Production Deployment

Deploy to cloud platforms for scalable, production-grade hosting.

#### Azure App Service

**Requirements:**

- Azure subscription
- Azure App Service (B1 or higher)
- Azure OpenAI, Azure Cognitive Search, Azure Database for PostgreSQL

**Steps:**

```bash
# 1. Log in with the Azure CLI (install it first if needed)
az login

# 2. Create App Service
az webapp create \
  --resource-group retail-chatbot-rg \
  --plan retail-chatbot-plan \
  --name retail-chatbot-app \
  --runtime "PYTHON:3.12"

# 3. Configure environment variables
az webapp config appsettings set \
  --resource-group retail-chatbot-rg \
  --name retail-chatbot-app \
  --settings \
    DEPLOYMENT_MODE=production \
    SESSION_STORE_TYPE=postgresql \
    POSTGRES_CONNECTION_STRING="..." \
    AZURE_OPENAI_API_KEY="..."

# 4. Deploy
az webapp up \
  --resource-group retail-chatbot-rg \
  --name retail-chatbot-app
```

**Recommended Configuration:**

- **Compute:** App Service B2 or higher (3.5GB RAM)
- **Session Store:** Azure Database for PostgreSQL
- **Cache:** Azure Cache for Redis (optional)
- **Monitoring:** Azure Application Insights + LangFuse

#### Docker Deployment

**Dockerfile Example:**

```dockerfile
FROM python:3.12-slim

WORKDIR /app

# Install uv
RUN pip install uv

# Copy project files
COPY . /app

# Install dependencies (uv sync creates a virtual environment in /app/.venv)
RUN uv sync

# Gradio binds to 127.0.0.1 by default; listen on all interfaces inside the
# container unless the app already sets server_name itself
ENV GRADIO_SERVER_NAME=0.0.0.0

# Expose port
EXPOSE 7860

# Run application inside the uv-managed environment
CMD ["uv", "run", "python", "-m", "chatassistant_retail"]
```

**Build and Run:**

```bash
# Build image
docker build -t retail-chatbot .

# Run container
docker run -p 7860:7860 \
  -e AZURE_OPENAI_API_KEY="..." \
  -e DEPLOYMENT_MODE=production \
  retail-chatbot
```

#### Kubernetes Deployment

For high-availability, multi-replica deployments:

**Key Considerations:**

- Use PostgreSQL or Redis for session storage (not Memory)
- Configure horizontal pod autoscaling (HPA)
- Set up ingress with TLS/SSL
- Use Azure Key Vault for secrets management
- Configure health checks and liveness probes

**Session Store:** Redis or PostgreSQL (required for multi-replica)

### Architecture Recommendations

| Deployment | Compute | Session Store | Cost | Best For |
|------------|---------|---------------|------|----------|
| **Local** | Developer machine | Memory | Free | Development |
| **HF Spaces** | Free tier (16GB) | Memory | Free | Demos, prototypes |
| **Azure App Service** | B2+ (3.5GB+) | PostgreSQL | $$ | Small-medium production |
| **Docker** | Custom | Redis/PostgreSQL | $ | Flexible hosting |
| **Kubernetes** | Multi-node cluster | Redis/PostgreSQL | $$$ | Enterprise, high-availability |

### Scaling Considerations

**Vertical Scaling (Single Instance):**

- Increase CPU/RAM allocation
- Use Memory or Redis session store
- Suitable for up to 1000 concurrent users

**Horizontal Scaling (Multiple Instances):**

- Deploy multiple replicas behind
  load balancer
- **Required:** Redis or PostgreSQL session store
- Configure sticky sessions (optional, for performance)
- Use Azure Front Door or Application Gateway
- Suitable for 1000+ concurrent users

**Performance Optimization:**

- Enable LangFuse for monitoring bottlenecks
- Use Redis for caching frequent queries
- Optimize Azure Search index (partition keys, replicas)
- Consider Azure OpenAI provisioned throughput for high volume

---

## Testing

The project uses **PyTest** for comprehensive testing with both unit and integration tests.

### Test Structure

```
tests/
├── unit/                          # Unit tests (isolated components)
│   ├── test_observability.py      # LangFuse client and metrics
│   ├── test_inventory_tools.py    # Inventory tool functions
│   ├── test_session_store.py      # Session persistence
│   ├── test_retriever.py          # RAG retrieval logic
│   └── test_data_generator.py     # Synthetic data generation
├── integration/                   # Integration tests (multiple components)
│   └── test_state_manager.py      # LangGraph state machine
└── test_chatassistant_retail.py   # Main chatbot tests
```

### Running Tests

```bash
# Run all tests
just test

# Run with verbose output
just test -v

# Run specific test file
just test tests/unit/test_observability.py

# Run specific test function
just test tests/unit/test_observability.py::TestLangFuseClient::test_get_langfuse_client_disabled

# Run with keyword filter
just test -k "inventory"

# Test on all Python versions (3.10, 3.11, 3.12, 3.13)
just testall
```

### Running Tests with Debugger

Use the `ipdb` debugger on test failures:

```bash
# Drop into debugger on first failure
just pdb

# Debug specific test
just pdb tests/unit/test_inventory_tools.py

# Limit to first 10 failures
pytest --pdb --maxfail=10
```

### Coverage Reporting

```bash
# Run tests with coverage
just coverage

# View coverage report in terminal
coverage report

# Generate HTML coverage report
coverage html
# Open htmlcov/index.html in browser
```

Target coverage: **>= 90%**

### Test Fixtures

PyTest fixtures are used
for common test setup:

```python
import pytest


@pytest.fixture
def mock_langfuse_client():
    """Provide mocked LangFuse client."""
    from unittest.mock import MagicMock

    return MagicMock()


# Synchronous fixture: a plain @pytest.fixture on an `async def` would hand
# the synchronous test below a coroutine instead of the tool instance.
@pytest.fixture
def inventory_tool():
    """Provide inventory tool instance."""
    from chatassistant_retail.tools import InventoryTools

    return InventoryTools()


def test_inventory_query(inventory_tool):
    result = inventory_tool.check_stock("SKU-123")
    assert result["stock_level"] > 0
```

### Async Testing

Tests for async functions use `pytest-asyncio`:

```python
import pytest


@pytest.mark.asyncio
async def test_async_llm_call():
    from chatassistant_retail.llm import AzureOpenAIClient

    client = AzureOpenAIClient()
    response = await client.chat("Test query")
    assert response is not None
```

### Mocking External Services

Use `pytest-mock` for mocking Azure services:

```python
def test_azure_search(mocker):
    # Mock Azure Cognitive Search. If the retriever module uses
    # `from azure.search.documents import SearchClient`, patch the name
    # in that module instead so the mock is actually picked up.
    mock_search = mocker.patch("azure.search.documents.SearchClient")
    mock_search.return_value.search.return_value = [
        {"sku": "SKU-123", "name": "Test Product"}
    ]

    # Test retriever
    from chatassistant_retail.rag import Retriever

    retriever = Retriever()
    results = retriever.search("test query")
    assert len(results) == 1
```

### CI/CD Testing

GitHub Actions runs tests automatically:

```yaml
# .github/workflows/test.yml
- Run tests on Python 3.12 and 3.13
- Check code formatting with Ruff
- Verify type hints with ty
- Generate coverage report
```

View test results in GitHub Actions: https://github.com/samir72/chatassistant_retail/actions

---

## Contributing

Contributions are welcome! Please follow these guidelines:

### Getting Started

1. **Fork** the repository on GitHub
2. **Clone your fork**:

   ```bash
   git clone https://github.com/your-username/chatassistant_retail.git
   cd chatassistant_retail
   ```

3. **Install dependencies**:

   ```bash
   uv sync
   ```

4. **Create a feature branch**:

   ```bash
   git checkout -b feature/your-feature-name
   ```

### Development Process

1. **Make your changes** following the code standards below
2. **Add tests** for new functionality (maintain >= 90% coverage)
3. **Run QA checks**:

   ```bash
   just qa  # Format, lint, type-check, and test
   ```

4. **Update documentation** if needed (README, docstrings, CLAUDE.md)
5. **Commit your changes**:

   ```bash
   git commit -m "Add feature: description"
   ```

6. **Push to your fork**:

   ```bash
   git push origin feature/your-feature-name
   ```

7. **Submit a pull request** to the main repository

### Code Standards

- **Style Guide**: PEP 8 (enforced by Ruff)
- **Line Length**: 120 characters maximum
- **Type Hints**: Required for all function signatures
- **Docstrings**: Required for all public functions and classes (Google style)
- **Test Coverage**: >= 90% for all new code
- **Import Sorting**: Automatic with Ruff (isort rules)

#### Example Code Style

```python
from typing import Optional

from chatassistant_retail.observability import trace


@trace(name="example_function", trace_type="function")
async def example_function(param1: str, param2: int = 0) -> Optional[dict]:
    """
    Brief description of function.

    Args:
        param1: Description of param1
        param2: Description of param2 (default: 0)

    Returns:
        Description of return value

    Raises:
        ValueError: When param2 is negative
    """
    if param2 < 0:
        raise ValueError("param2 must be non-negative")
    return {"param1": param1, "param2": param2}
```

### Pull Request Guidelines

- **Title**: Clear, concise description (e.g., "Add purchase order export feature")
- **Description**: Explain what changed and why
- **Tests**: Include test results showing all tests pass
- **Coverage**: Show coverage hasn't decreased
- **Documentation**: Update README/docs if needed
- **Breaking Changes**: Clearly mark any breaking changes

### Reporting Bugs

Use GitHub Issues: https://github.com/samir72/chatassistant_retail/issues

Include:

- Python version
- Environment (OS, dependencies)
- Steps to reproduce
- Expected vs actual behavior
- Error messages/stack traces
- Minimal code example

---

## Credits

This package was created with [Cookiecutter](https://github.com/audreyfeldroy/cookiecutter) and the [audreyfeldroy/cookiecutter-pypackage](https://github.com/audreyfeldroy/cookiecutter-pypackage) project template.

### Technologies Used

- **LangGraph**: Agentic workflow orchestration
- **LangChain**: LLM framework and integrations
- **Azure OpenAI**: GPT-4o-mini language model
- **Azure Cognitive Search**: Vector search and RAG
- **LangFuse**: Observability and tracing
- **Gradio**: Web UI framework
- **FastMCP**: Model Context Protocol server
- **PyTest**: Testing framework
- **Ruff**: Python linter and formatter
- **uv**: Fast Python package manager

---

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---