title: DataSprint
app_file: fleet_optimizer.py
sdk: gradio
sdk_version: 4.44.1

Demand Forecasting for Retail

This project forecasts future sales per product, flags anomalies, and recommends inventory levels for retailers. It provides an interactive Gradio interface with Plotly charts, an offline LLM-powered chat (Ollama via LangChain), and document retrieval from a vector store.

Features

  • Time series forecasting (Prophet) with confidence intervals
  • Anomaly detection using statistical methods
  • Inventory recommendations with safety stock calculations
  • Interactive Plotly charts with enhanced visualizations
  • LLM-powered chat (offline via Ollama) with conversation memory
  • Vector store retrieval for retail knowledge base access
  • A larger sample dataset (84 records) for more realistic testing scenarios
  • Advanced LangChain features including chains, memory, and retrieval QA
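
The README does not say which statistical method the anomaly detector uses. As a minimal sketch of one common choice, the z-score rule below flags any day whose sales sit more than a chosen number of standard deviations from the mean. The function name, the threshold, and the sample values are illustrative assumptions, not taken from the project.

```python
from statistics import mean, stdev


def flag_anomalies(values, threshold=2.0):
    """Return the indices of values whose |z-score| exceeds threshold."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # a constant series has no outliers
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Made-up daily sales with one obvious spike at index 6.
daily_sales = [20, 22, 19, 21, 23, 20, 95, 22, 21, 20]
print(flag_anomalies(daily_sales))  # prints [6]
```

With short series the sample z-score is bounded (about 2.85 for 10 points), so a threshold of 3 would never fire here; that is why the sketch defaults to 2.0.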

Setup

  1. Install dependencies:

    pip install -r requirements.txt
    
  2. Install Ollama (https://ollama.com/), then download and start the Mistral model:

    ollama run mistral
    
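  3. Run the app (the Space metadata lists fleet_optimizer.py as app_file; run that file instead if app.py is not present):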

    python app.py
    

Data

  • Small dataset: data/sales.csv (10 records for quick testing)
  • Large dataset: data/sales_large.csv (84 records with categories, regions, prices)
  • Knowledge base: data/retail_documents.txt (comprehensive retail analytics guide)
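
The column layout of the CSV files is not documented here. The sketch below assumes hypothetical columns named date, store, product, and sales, and shows one way to group rows into a per-store, per-product series using only the standard library. The inline rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Made-up rows standing in for data/sales.csv; the real column names may differ.
SAMPLE_CSV = """date,store,product,sales
2024-01-02,StoreA,Product1,23
2024-01-01,StoreA,Product1,20
2024-01-01,StoreB,Product1,15
"""


def load_series(handle):
    """Group (date, sales) pairs by (store, product), ordered by date."""
    rows = sorted(csv.DictReader(handle), key=lambda r: r["date"])
    series = defaultdict(list)
    for row in rows:
        key = (row["store"], row["product"])
        series[key].append((row["date"], float(row["sales"])))
    return dict(series)


series = load_series(io.StringIO(SAMPLE_CSV))
print(series[("StoreA", "Product1")])
```

To read a real file, pass an open handle instead, for example `with open("data/sales.csv", newline="") as f: load_series(f)`, after adjusting the column names to match the file.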

Usage

Basic Analytics

  • Select store and product from dropdowns
  • View interactive forecast charts with confidence intervals
  • Analyze anomalies and inventory recommendations
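
The exact formula behind the inventory recommendations is not stated in this README. A minimal sketch of the textbook approach, assuming normally distributed daily demand and a fixed lead time, is shown below. The function names, the 95% default service level, and the sample demand values are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def safety_stock(daily_demand, lead_time_days, service_level=0.95):
    """Safety stock = z * sigma_daily * sqrt(lead time in days)."""
    z = NormalDist().inv_cdf(service_level)  # z-score for the service level
    return z * stdev(daily_demand) * sqrt(lead_time_days)


def reorder_point(daily_demand, lead_time_days, service_level=0.95):
    """Expected demand over the lead time plus the safety stock."""
    expected = mean(daily_demand) * lead_time_days
    return expected + safety_stock(daily_demand, lead_time_days, service_level)


# Made-up demand history (units per day) and a 4-day supplier lead time.
demand = [20, 22, 19, 21, 23, 20, 22, 21]
print(round(safety_stock(demand, lead_time_days=4), 1))
print(round(reorder_point(demand, lead_time_days=4), 1))
```

Raising the service level increases the z-score and therefore the buffer, which is the usual trade-off between stockout risk and holding cost.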

AI Chat Features

  • General questions: "Explain the forecast", "What anomalies do you see?"
  • Knowledge base queries: Enable the "Use Knowledge Base" option to ground answers in the retail knowledge base, e.g. for best-practice questions
  • Comparison analysis: "Compare StoreA vs StoreB"
  • Business insights: "What are the key trends?"

Knowledge Base Access

The system includes a comprehensive retail knowledge base covering:

  • Sales forecasting best practices
  • Inventory management guidelines
  • Retail metrics and KPIs
  • Seasonal patterns by category
  • Anomaly detection methods
  • Business intelligence insights

Vector Store Features

  • Document retrieval: Search retail knowledge base
  • Context-aware responses: AI uses relevant documents for answers
  • Persistent storage: ChromaDB vector store with sentence transformers
  • Source attribution: Responses include source document information
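
To illustrate the idea behind retrieval with source attribution, here is a toy sketch in plain Python. It is not the project's ChromaDB or sentence-transformers code: word counts stand in for real embeddings, and the chunk texts and source labels are made up. It only shows the shape of the technique: embed the query, rank chunks by cosine similarity, and return each hit together with its source.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return up to k chunks with non-zero similarity, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(c["text"])), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:k] if score > 0]


# Hypothetical chunks; the "source" labels are illustrative only.
chunks = [
    {"text": "Safety stock protects against demand variability during lead time.",
     "source": "retail_documents.txt#0"},
    {"text": "Seasonal patterns differ by product category.",
     "source": "retail_documents.txt#1"},
    {"text": "Track sell-through rate and inventory turnover as KPIs.",
     "source": "retail_documents.txt#2"},
]

for hit in retrieve("How do I calculate safety stock?", chunks, k=1):
    print(hit["source"], "->", hit["text"])
```

A real embedding model matches on meaning rather than shared words, so it would also find relevant chunks that use different vocabulary from the query.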

Tech Stack

  • Python: Core language
  • Prophet: Time series forecasting
  • Plotly: Interactive visualizations
  • Gradio: Web interface
  • LangChain: LLM orchestration and chains
  • Ollama: Offline LLM (Mistral)
  • ChromaDB: Vector store for document retrieval
  • Sentence Transformers: Document embeddings

Advanced Features

LangChain Integration

  • Conversation Memory: Remembers chat history
  • Custom Chains: RetailAnalysisChain, SalesComparisonChain
  • Retrieval QA: Knowledge base question answering
  • Prompt Templates: Structured, reusable prompts
  • Streaming Responses: Real-time AI output
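
The combination of prompt templates and conversation memory can be sketched without LangChain. The example below is a plain-Python illustration of the concept, not the project's implementation: the template wording, the ChatMemory class, and build_prompt are all hypothetical names. It keeps the last few turns and injects them into a reusable template before each model call.

```python
from collections import deque

# Hypothetical template; the project's actual prompts are not shown in this README.
PROMPT = (
    "You are a retail analytics assistant.\n"
    "Conversation so far:\n{history}\n"
    "Forecast summary: {summary}\n"
    "Question: {question}\n"
    "Answer:"
)


class ChatMemory:
    """Keep only the most recent max_turns (question, answer) pairs."""

    def __init__(self, max_turns=3):
        self.turns = deque(maxlen=max_turns)

    def add(self, question, answer):
        self.turns.append((question, answer))

    def render(self):
        if not self.turns:
            return "(none)"
        return "\n".join(f"User: {q}\nAssistant: {a}" for q, a in self.turns)


def build_prompt(memory, summary, question):
    """Fill the template with chat history, data context, and the new question."""
    return PROMPT.format(history=memory.render(), summary=summary, question=question)


memory = ChatMemory(max_turns=2)
memory.add("Explain the forecast", "Sales trend upward over the next two weeks.")
print(build_prompt(memory, "Product1 at StoreA, 14-day horizon", "What anomalies do you see?"))
```

Bounding the history keeps the prompt within the model's context window, at the cost of forgetting older turns.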

Vector Store Capabilities

  • Semantic Search: Find relevant retail knowledge
  • Document Chunking: Intelligent text splitting
  • Embedding Models: Sentence transformers for document encoding
  • Similarity Search: Retrieve contextually relevant information
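
Document chunking usually means splitting the knowledge base into overlapping windows so that a sentence cut at a boundary still appears whole in a neighbouring chunk. The project's actual splitter and its settings are not documented here; the word-based sketch below, with hypothetical chunk_size and overlap parameters, shows the basic mechanism.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into chunks of chunk_size words, sharing overlap words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks


# Twelve placeholder words, windows of 5 words that overlap by 2.
sample = " ".join(f"w{i}" for i in range(12))
for chunk in chunk_words(sample, chunk_size=5, overlap=2):
    print(chunk)
```

Each chunk would then be embedded and stored in the vector store along with a reference back to its source document.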

Large Dataset Testing

  • Multiple stores: StoreA, StoreB, StoreC, StoreD
  • Product categories: Electronics, Clothing, Home
  • Regional data: North, South, East, West regions
  • Extended time periods: 14-day forecasts
  • Rich metadata: Prices, categories, regions
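
To make the 14-day horizon concrete, the sketch below produces a 14-day forecast with a rough uncertainty band using only the standard library. It is a deliberately simple stand-in, not Prophet: it fits a straight-line trend by least squares and uses a constant-width band of z times the residual standard deviation, which is cruder than a proper prediction interval. The output keys borrow Prophet's yhat, yhat_lower, and yhat_upper column names; the function name and the sample history are assumptions.

```python
from datetime import date, timedelta
from math import sqrt


def linear_forecast(history, last_date, horizon=14, z=1.96):
    """Fit y = intercept + slope * t and extrapolate horizon days ahead."""
    n = len(history)
    if n < 3:
        raise ValueError("need at least 3 observations")
    t_mean = (n - 1) / 2
    y_mean = sum(history) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    residuals = [y - (intercept + slope * t) for t, y in enumerate(history)]
    sigma = sqrt(sum(r * r for r in residuals) / (n - 2))
    rows = []
    for step in range(1, horizon + 1):
        yhat = intercept + slope * (n - 1 + step)
        rows.append({
            "ds": last_date + timedelta(days=step),
            "yhat": yhat,
            "yhat_lower": yhat - z * sigma,  # rough band, not a true interval
            "yhat_upper": yhat + z * sigma,
        })
    return rows


# Made-up daily sales ending on 2024-01-10.
history = [20, 21, 23, 22, 25, 26, 25, 28, 29, 30]
for row in linear_forecast(history, last_date=date(2024, 1, 10))[:3]:
    print(row["ds"], round(row["yhat"], 1), round(row["yhat_lower"], 1), round(row["yhat_upper"], 1))
```

Prophet additionally models seasonality and changepoints, so its forecasts and intervals will differ from this straight-line approximation.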

Example Queries

Data Analysis

  • "Explain the sales forecast for Product1 at StoreA"
  • "What anomalies are detected in the data?"
  • "Compare sales performance between stores"
  • "Give me inventory recommendations"

Knowledge Base

  • "What are best practices for inventory management?"
  • "How do I calculate safety stock?"
  • "What KPIs should I track for retail?"
  • "Explain seasonal patterns in retail sales"

System Information

Use the "Data Summary" and "Vector Store Info" buttons to view:

  • Dataset statistics and metadata
  • Vector store collection information
  • Embedding model details
  • Document chunk counts

Performance

  • Offline LLM: No internet required for AI responses
  • Fast retrieval: Vector store with optimized embeddings
  • Scalable: Handles large datasets efficiently
  • Persistent: Saves conversation history and vector store

Future Enhancements

  • Real-time data integration
  • Advanced anomaly detection algorithms
  • Multi-language support
  • API endpoints for external systems
  • Advanced visualization options