
title: DataSprint
app_file: fleet_optimizer.py
sdk: gradio
sdk_version: 4.44.1

Demand Forecasting for Retail

This project predicts future sales per product, flags anomalies, and recommends inventory levels for retailers. It provides an interactive Gradio interface with chart visualizations, an offline LLM-powered chat (Ollama via LangChain), and document retrieval from a vector store.

Features

Time series forecasting (Prophet) with confidence intervals
Anomaly detection using statistical methods
Inventory recommendations with safety stock calculations
Interactive Plotly charts with enhanced visualizations
LLM-powered chat (offline via Ollama) with conversation memory
Vector store retrieval for retail knowledge base access
Large dataset support for realistic testing scenarios
Advanced LangChain features including chains, memory, and retrieval QA
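
The README does not say which statistical method the anomaly detector uses. As a minimal sketch, assuming a simple z-score rule (the function name `flag_anomalies` and the threshold of 2.0 are illustrative, not taken from the project), flagging unusual sales values could look like this:

```python
from statistics import mean, stdev

def flag_anomalies(sales, threshold=2.0):
    """Return the indices of values whose absolute z-score exceeds threshold.

    Assumption: a plain z-score rule; the project may use a different method.
    """
    if len(sales) < 2:
        return []  # not enough points to estimate spread
    mu = mean(sales)
    sigma = stdev(sales)  # sample standard deviation
    if sigma == 0:
        return []  # constant series, nothing stands out
    return [i for i, s in enumerate(sales) if abs(s - mu) / sigma > threshold]

# Made-up daily sales with one obvious spike at index 6.
daily_sales = [20, 22, 19, 21, 23, 20, 95, 22, 21, 20]
print(flag_anomalies(daily_sales))
```

A z-score rule is easy to explain to end users, but a large outlier inflates the standard deviation itself, so a robust variant (median and MAD) may be preferable on short series.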

Setup

Install dependencies:
```
pip install -r requirements.txt
```
Install Ollama (https://ollama.com/), then download and start the Mistral model:
```
ollama run mistral
```
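Run the app (the Space metadata names fleet_optimizer.py as app_file, so use that file if app.py is not present):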
```
python app.py
```
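
If the chat features do not respond, it can help to confirm that the local Ollama server is reachable. The sketch below assumes Ollama's default address (`http://localhost:11434`) and its `/api/tags` endpoint, which lists locally available models; the helper name `ollama_is_up` is hypothetical and not part of this project.

```python
import json
import urllib.request

def ollama_is_up(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an Ollama server answers at base_url, else False."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts (URLError is a subclass);
        # ValueError covers a response body that is not valid JSON.
        return False
    return isinstance(payload, dict) and "models" in payload

if __name__ == "__main__":
    print("Ollama reachable:", ollama_is_up())
```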

Data

Small dataset: data/sales.csv (10 records for quick testing)
Large dataset: data/sales_large.csv (84 records with categories, regions, prices)
Knowledge base: data/retail_documents.txt (comprehensive retail analytics guide)
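
The README does not document the CSV schema. As a rough sketch, assuming columns named `date`, `store`, `product`, and `sales` (hypothetical names, as are the sample rows), per-store and per-product totals could be computed with the standard library:

```python
import csv
import io
from collections import defaultdict

# Illustrative rows only; the real data/sales.csv may use different columns.
SAMPLE = """date,store,product,sales
2024-01-01,StoreA,Product1,20
2024-01-02,StoreA,Product1,22
2024-01-01,StoreB,Product1,15
"""

def total_sales_by_pair(csv_file):
    """Sum the sales column for each (store, product) pair."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[(row["store"], row["product"])] += int(row["sales"])
    return dict(totals)

totals = total_sales_by_pair(io.StringIO(SAMPLE))
print(totals)
```

To run the same function on a real file, open it with `open("data/sales.csv", newline="")` and pass the file object instead of the `StringIO` buffer.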

Usage

Basic Analytics

Select store and product from dropdowns
View interactive forecast charts with confidence intervals
Analyze anomalies and inventory recommendations
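
The exact inventory formula is not stated in this README. A common textbook approach, shown here as a sketch under that assumption, sets safety stock to z × σ_demand × √(lead time) and the reorder point to mean demand × lead time + safety stock. The function names, the 95% service level, and the sample demand values are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def safety_stock(daily_demand, lead_time_days, service_level=0.95):
    """Safety stock = z * std(daily demand) * sqrt(lead time in days)."""
    z = NormalDist().inv_cdf(service_level)  # z-score for the target service level
    return z * stdev(daily_demand) * sqrt(lead_time_days)

def reorder_point(daily_demand, lead_time_days, service_level=0.95):
    """Expected demand over the lead time plus the safety stock buffer."""
    expected = mean(daily_demand) * lead_time_days
    return expected + safety_stock(daily_demand, lead_time_days, service_level)

# Made-up demand history (units per day) and a 4-day supplier lead time.
demand = [20, 22, 19, 21, 23, 20, 22]
print(round(safety_stock(demand, 4), 2))
print(round(reorder_point(demand, 4), 2))
```

This version assumes demand is roughly normal and the lead time is fixed; variable lead times need an extra variance term.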

AI Chat Features

General questions: "Explain the forecast", "What anomalies do you see?"
Knowledge base queries: tick "Use Knowledge Base" to have answers draw on the retail knowledge base (for example, best practices)
Comparison analysis: "Compare StoreA vs StoreB"
Business insights: "What are the key trends?"

Knowledge Base Access

The system includes a comprehensive retail knowledge base covering:

Sales forecasting best practices
Inventory management guidelines
Retail metrics and KPIs
Seasonal patterns by category
Anomaly detection methods
Business intelligence insights

Vector Store Features

Document retrieval: Search retail knowledge base
Context-aware responses: AI uses relevant documents for answers
Persistent storage: ChromaDB vector store with sentence transformers
Source attribution: Responses include source document information
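
To illustrate retrieval with source attribution without requiring ChromaDB or an embedding model, here is a toy sketch that swaps in bag-of-words cosine similarity for real sentence embeddings. It is not the project's implementation; the chunk texts, the `source` labels, and the function names are made up for the example.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, chunks, k=1):
    """Return up to k chunks most similar to the query, skipping zero-score chunks."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(c["text"]))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:k] if score > 0]

# Hypothetical chunks; each keeps a source label so answers can cite it.
chunks = [
    {"source": "retail_documents.txt#0",
     "text": "Safety stock protects against demand variability during lead time."},
    {"source": "retail_documents.txt#1",
     "text": "Seasonal patterns differ by product category such as clothing."},
]
hits = retrieve("How do I calculate safety stock?", chunks)
for hit in hits:
    print(hit["source"], "->", hit["text"])
```

Keeping the source label next to each chunk is what makes attribution possible: whatever the retriever returns can be quoted together with where it came from.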

Tech Stack

Python: Core language
Prophet: Time series forecasting
Plotly: Interactive visualizations
Gradio: Web interface
LangChain: LLM orchestration and chains
Ollama: Offline LLM (Mistral)
ChromaDB: Vector store for document retrieval
Sentence Transformers: Document embeddings

Advanced Features

LangChain Integration

Conversation Memory: Remembers chat history
Custom Chains: RetailAnalysisChain, SalesComparisonChain
Retrieval QA: Knowledge base question answering
Prompt Templates: Structured, reusable prompts
Streaming Responses: Real-time AI output
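
The ideas behind conversation memory and prompt templates can be sketched in plain Python. This is a conceptual stand-in, not LangChain's API: the `ChatMemory` class, the `build_prompt` helper, and the prompt wording are hypothetical.

```python
from collections import deque
from string import Template

# A reusable prompt with two slots: prior conversation and the new question.
PROMPT = Template(
    "You are a retail analytics assistant.\n"
    "Conversation so far:\n$history\n"
    "User question: $question\n"
)

class ChatMemory:
    """Keep only the most recent turns so the prompt stays bounded."""

    def __init__(self, max_turns=3):
        self.turns = deque(maxlen=max_turns)  # oldest turns drop off automatically

    def add(self, user, assistant):
        self.turns.append((user, assistant))

    def render(self):
        lines = [f"User: {u}\nAssistant: {a}" for u, a in self.turns]
        return "\n".join(lines) or "(none)"

def build_prompt(memory, question):
    """Fill the template with the remembered history and the new question."""
    return PROMPT.substitute(history=memory.render(), question=question)

memory = ChatMemory(max_turns=2)
memory.add("Explain the forecast", "Sales are expected to rise next week.")
memory.add("What anomalies do you see?", "One spike on day 7.")
memory.add("Compare StoreA vs StoreB", "StoreA sells more overall.")
prompt = build_prompt(memory, "What are the key trends?")
print(prompt)
```

A bounded window is the simplest memory policy; summarizing older turns is a common alternative when longer context matters.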

Vector Store Capabilities

Semantic Search: Find relevant retail knowledge
Document Chunking: Intelligent text splitting
Embedding Models: Sentence transformers for document encoding
Similarity Search: Retrieve contextually relevant information
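
Document chunking, in its simplest form, slides a fixed-size window over the text with some overlap so that sentences cut at a boundary still appear whole in a neighboring chunk. The sketch below assumes character-based windows; the function name and the default sizes are illustrative, and the project's splitter may be more sophisticated.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into windows of chunk_size characters sharing `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
```

Each chunk would then be embedded and stored; larger overlaps improve recall at chunk boundaries at the cost of more stored vectors.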

Large Dataset Testing

Multiple stores: StoreA, StoreB, StoreC, StoreD
Product categories: Electronics, Clothing, Home
Regional data: North, South, East, West regions
Extended time periods: 14-day forecasts
Rich metadata: Prices, categories, regions

Example Queries

Data Analysis

"Explain the sales forecast for Product1 at StoreA"
"What anomalies are detected in the data?"
"Compare sales performance between stores"
"Give me inventory recommendations"

Knowledge Base

"What are best practices for inventory management?"
"How do I calculate safety stock?"
"What KPIs should I track for retail?"
"Explain seasonal patterns in retail sales"

System Information

Use the "Data Summary" and "Vector Store Info" buttons to view:

Dataset statistics and metadata
Vector store collection information
Embedding model details
Document chunk counts

Performance

Offline LLM: No internet required for AI responses once the Mistral model has been downloaded
Fast retrieval: Vector store with optimized embeddings
Scalable: Handles large datasets efficiently
Persistent: Saves conversation history and vector store

Future Enhancements

Real-time data integration
Advanced anomaly detection algorithms
Multi-language support
API endpoints for external systems
Advanced visualization options