Lohith Venkat Chamakura

metadata
title: Business Intelligence Dashboard
emoji: 📊
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.0.2
app_file: app.py
pinned: false

Business Intelligence Dashboard

An interactive Business Intelligence dashboard built with Gradio that enables users to explore and analyze business data through an intuitive, Tableau-like web interface.

Features

📁 Data Upload & Validation

  • Upload CSV or Excel files through the web interface
  • Display basic dataset information (shape, columns, data types)
  • Show data preview (first 10 rows)
  • Graceful error handling with informative messages
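
The upload step can be sketched with pandas as shown here. This is a minimal illustration, not the code in data_processor.py: the function names `load_dataset` and `describe_dataset` are hypothetical, and the exact error messages are assumptions.

```python
from pathlib import Path

import pandas as pd


def load_dataset(path: str) -> pd.DataFrame:
    """Load a CSV or Excel file, turning read failures into a readable ValueError."""
    suffix = Path(path).suffix.lower()
    try:
        if suffix == ".csv":
            df = pd.read_csv(path)
        elif suffix in (".xlsx", ".xls"):
            df = pd.read_excel(path)  # .xlsx files need openpyxl installed
        else:
            raise ValueError(f"Unsupported file type: {suffix or 'none'}")
    except (OSError, pd.errors.ParserError, pd.errors.EmptyDataError) as exc:
        raise ValueError(f"Could not read {path}: {exc}") from exc
    if df.empty:
        raise ValueError("The file was read but contains no rows.")
    return df


def describe_dataset(df: pd.DataFrame) -> dict:
    """Collect the basic information shown after an upload."""
    return {
        "shape": df.shape,
        "columns": list(df.columns),
        "dtypes": {col: str(dtype) for col, dtype in df.dtypes.items()},
        "preview": df.head(10),  # first 10 rows
    }
```

A caller can catch `ValueError` and show its message in the interface instead of a stack trace.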

📈 Data Exploration & Summary Statistics

  • Automated Data Profiling:
    • Numerical columns: mean, median, std, min, max, quartiles
    • Categorical columns: unique values, value counts, mode
    • Missing value report
    • Correlation matrix for numerical features
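
The profiling items above map onto a handful of pandas calls. The sketch below assumes a hypothetical `profile` function and treats every non-numeric, non-datetime column as categorical; the project's own profiling code may differ.

```python
import pandas as pd


def profile(df: pd.DataFrame) -> dict:
    """Build a simple profile: numeric summary, categorical summary, missing values, correlations."""
    numeric = df.select_dtypes(include="number")
    # Assumption: anything that is neither numeric nor datetime counts as categorical.
    categorical = df.select_dtypes(exclude=["number", "datetime", "datetimetz"])
    categorical_summary = {}
    for col in categorical.columns:
        values = categorical[col]
        modes = values.mode()  # NaN is ignored by default
        categorical_summary[col] = {
            "unique": int(values.nunique()),
            "mode": modes.iloc[0] if not modes.empty else None,
            "top_counts": values.value_counts().head(5),
        }
    return {
        # describe() reports count, mean, std, min, quartiles (50% is the median), max
        "numeric_summary": numeric.describe(),
        "categorical_summary": categorical_summary,
        "missing": df.isna().sum(),
        "correlation": numeric.corr(),  # Pearson correlation by default
    }
```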

🔍 Interactive Filtering

  • Dynamic filtering interface based on column types:
    • Numerical: Range sliders with min/max inputs
    • Categorical: Multi-select checkboxes
    • Date: Date range pickers (when applicable)
  • Real-time row count updates as filters are applied
  • Display filtered data preview
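
One way to combine the three filter kinds is to build a single boolean mask. The function `apply_filters` and its argument shapes are assumptions made for this sketch; they are not taken from the project code.

```python
import pandas as pd


def apply_filters(df, numeric_ranges=None, categories=None, date_ranges=None):
    """Return the rows that pass every filter.

    numeric_ranges: {column: (low, high)}, inclusive on both ends
    categories:     {column: list of allowed values}
    date_ranges:    {column: (start, end)}, inclusive on both ends
    """
    mask = pd.Series(True, index=df.index)
    for col, (low, high) in (numeric_ranges or {}).items():
        mask &= df[col].between(low, high)
    for col, allowed in (categories or {}).items():
        mask &= df[col].isin(allowed)
    for col, (start, end) in (date_ranges or {}).items():
        dates = pd.to_datetime(df[col], errors="coerce")  # unparseable values become NaT
        mask &= dates.between(pd.Timestamp(start), pd.Timestamp(end))
    return df[mask]
```

The row count shown next to the filters is then simply `len(apply_filters(...))`, recomputed whenever a control changes.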

📊 Visualizations

The dashboard implements five visualization types:

  1. Time Series Plot: Trends over time with aggregation options
  2. Distribution Plot: Histogram or box plot for numerical data
  3. Category Analysis: Bar chart or pie chart for categorical data
  4. Scatter Plot: Show relationships between variables
  5. Correlation Heatmap: Visualize correlations between numerical features

Features:

  • User selects which columns to visualize
  • Clear titles, labels, and legends
  • Multiple aggregation methods (sum, mean, count, median)
  • Interactive charts rendered with Plotly

💡 Insights Generation

Automatically generates insights:

  • Top/Bottom Performers: Identify highest/lowest values
  • Basic Trends: Detect patterns in time series data
  • Summary Statistics: High-level dataset overview
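
The README does not say how trends are detected. The sketch below assumes a least-squares slope over the ordered values for the trend and a grouped total for top/bottom performers; `top_bottom` and `trend_direction` are hypothetical names.

```python
import numpy as np
import pandas as pd


def top_bottom(df, label_col, value_col, n=3):
    """Return (top n, bottom n) labels ranked by their summed value."""
    totals = df.groupby(label_col)[value_col].sum()
    return totals.nlargest(n), totals.nsmallest(n)


def trend_direction(series: pd.Series) -> str:
    """Classify an ordered series by the sign of its least-squares slope."""
    values = series.dropna().to_numpy(dtype=float)
    if len(values) < 2:
        return "insufficient data"
    slope = np.polyfit(np.arange(len(values)), values, 1)[0]
    if np.isclose(slope, 0.0):
        return "flat"
    return "increasing" if slope > 0 else "decreasing"
```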

💾 Export Functionality

  • Export filtered data as CSV
  • Export visualizations as PNG images
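
Both exports can be done with standard pandas and Plotly calls. The helper names `export_csv` and `export_png` are assumptions for this sketch; PNG export through `Figure.write_image` relies on the kaleido package listed in the requirements.

```python
from pathlib import Path

import pandas as pd


def export_csv(df: pd.DataFrame, path: str) -> Path:
    """Write the (filtered) DataFrame to CSV without the index column."""
    out = Path(path)
    df.to_csv(out, index=False)
    return out


def export_png(fig, path: str) -> Path:
    """Write a Plotly figure to PNG.

    fig is expected to be a plotly.graph_objects.Figure; write_image needs kaleido.
    """
    out = Path(path)
    fig.write_image(str(out), format="png")
    return out
```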

High-Level Architecture

┌─────────────────────────────────────────────────────────────────┐
│                         User Interface                          │
│                      (Gradio Web Interface)                     │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │ Data Upload  │  │ Visualization│  │   Insights   │           │
│  │   & Preview  │  │   & Charts   │  │  Generation  │           │
│  └──────────────┘  └──────────────┘  └──────────────┘           │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │  Statistics  │  │   Filter &   │  │    Export    │           │
│  │  & Profiling │  │   Explore    │  │ Functionality│           │
│  └──────────────┘  └──────────────┘  └──────────────┘           │
└────────────────────────────┬────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Application Layer (app.py)                   │
│  • Orchestrates user interactions                               │
│  • Manages global state (current_df, filters, figures)          │
│  • Routes requests to appropriate modules                       │
└────────────────────────────┬────────────────────────────────────┘
                             │
        ┌────────────────────┼────────────────────┐
        │                    │                    │
        ▼                    ▼                    ▼
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│  Data Processing │ │  Visualizations  │ │     Insights     │
│   Layer          │ │     Layer        │ │      Layer       │
│                  │ │                  │ │                  │
│ data_processor.py│ │visualizations.py │ │  insights.py     │
│                  │ │                  │ │                  │
│ • CSV/Excel Load │ │ • Time Series    │ │ • Top/Bottom     │
│ • Data Cleaning  │ │ • Distribution   │ │   Performers     │
│ • Filtering      │ │ • Category       │ │ • Trend Analysis │
│ • Statistics     │ │   Analysis       │ │ • Summary Stats  │
│   Generation     │ │ • Scatter Plot   │ │                  │
│                  │ │ • Correlation    │ │                  │
│                  │ │   Heatmap        │ │                  │
└──────────────────┘ └──────────────────┘ └──────────────────┘
        │                    │                    │
        └────────────────────┼────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Utilities Layer (utils.py)                   │
│  • Column type detection (numerical, categorical, date)         │
│  • Missing value analysis                                       │
│  • Data validation helpers                                      │
└────────────────────────────┬────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│                      Data Sources                               │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │ stocks.csv   │  │sales_train   │  │Online Retail │           │
│  │              │  │   .csv       │  │   .xlsx      │           │
│  └──────────────┘  └──────────────┘  └──────────────┘           │
│                                                                 │
│  • CSV files (pandas.read_csv)                                  │
│  • Excel files (pandas.read_excel)                              │
│  • User-uploaded datasets                                       │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│                    External Libraries                           │
│  • pandas: Data manipulation and analysis                       │
│  • plotly: Interactive visualizations                           │
│  • gradio: Web interface framework                              │
│  • numpy: Numerical computations                                │
└─────────────────────────────────────────────────────────────────┘
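
The Utilities Layer in the diagram lists column type detection. A possible shape for that helper is sketched below with `pandas.api.types`; the name `detect_column_types` and the decision to treat boolean columns as categorical are assumptions, not the contents of utils.py.

```python
import pandas as pd
from pandas.api import types as ptypes


def detect_column_types(df: pd.DataFrame) -> dict:
    """Sort column names into numerical, categorical and date buckets."""
    result = {"numerical": [], "categorical": [], "date": []}
    for col in df.columns:
        series = df[col]
        if ptypes.is_bool_dtype(series):
            # Checked first because pandas also reports bool as numeric.
            result["categorical"].append(col)
        elif ptypes.is_numeric_dtype(series):
            result["numerical"].append(col)
        elif ptypes.is_datetime64_any_dtype(series):
            result["date"].append(col)
        else:
            result["categorical"].append(col)
    return result
```

Date columns read from CSV arrive as plain strings unless they are parsed (for example with `pd.to_datetime`), so a real implementation would likely need an extra parsing step before this check.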

Project Structure

project/
├── app.py                 # Main Gradio application
├── data_processor.py      # Data loading, cleaning, filtering
├── visualizations.py      # Chart creation functions
├── insights.py            # Automated insight generation
├── utils.py               # Helper functions
├── constants.py           # Constants used throughout the code
├── requirements.txt       # Python dependencies
├── README.md              # This file
└── data/                  # Sample datasets
    β”œβ”€β”€ sales_train.csv
    β”œβ”€β”€ stocks.csv
    └── Online Retail.xlsx

Setup Instructions

1. Install Dependencies

pip install -r requirements.txt

Note: This project pins Gradio 6.0.2. The pinned dependencies need a recent interpreter: pandas 2.2 requires Python 3.9 or higher and Gradio 6 requires Python 3.10 or higher, so use Python 3.10+.

2. Run the Application

python app.py

The application will launch and be accessible at http://localhost:7860 in your web browser.

Usage

  1. Upload Data: Navigate to the "Data Upload & Preview" tab and upload a CSV or Excel file
  2. View Statistics: Go to "Statistics & Profiling" to see comprehensive data statistics
  3. Apply Filters: Use "Filter & Explore" to filter your data by column values
  4. Create Visualizations: Visit "Visualizations" to create interactive charts
  5. Generate Insights: Check "Insights" for automated data insights
  6. Export Data: Use "Export" to download filtered data or visualizations

Aggregation Methods

The dashboard supports multiple aggregation methods for visualizations:

  • Sum: Adds all values together (useful for totals, volumes)
  • Mean: Calculates the average value (useful for prices, rates)
  • Count: Counts the number of data points (useful for frequency)
  • Median: Finds the middle value (robust to outliers)
  • None: No aggregation (shows raw data points)
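
The effect of each method is easiest to see on a tiny example. The data and the `aggregate` helper below are made up for illustration and are not part of the project.

```python
import pandas as pd

# Three rows share the first date, one row has the second date.
prices = pd.DataFrame({
    "Date": ["2024-01-01", "2024-01-01", "2024-01-01", "2024-01-02"],
    "Close": [10.0, 20.0, 60.0, 5.0],
})


def aggregate(df, group_col, value_col, method):
    """Group by group_col and reduce value_col; method=None returns the raw rows."""
    if method is None:
        return df[[group_col, value_col]]
    return df.groupby(group_col, as_index=False)[value_col].agg(method)


results = {
    method: aggregate(prices, "Date", "Close", method)["Close"].tolist()
    for method in ("sum", "mean", "count", "median")
}
# results["sum"]    -> [90.0, 5.0]
# results["mean"]   -> [30.0, 5.0]
# results["count"]  -> [3, 1]
# results["median"] -> [20.0, 5.0]
```

The single large value (60.0) pulls the mean up to 30.0 while the median stays at 20.0, which is why the median is described as robust to outliers.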

Step-by-Step Tutorial: Monthly Average Closing Price

Let's walk through a complete example:

Step 1: Load the Data

  1. Open the dashboard
  2. Go to the 📁 Data Upload & Preview tab
  3. Click Upload Dataset
  4. Select data/stocks.csv
  5. Click Load Data
  6. Verify the data preview shows the stock data

Step 2: Create the Visualization

  1. Navigate to the 📊 Visualizations tab
  2. Configure the chart:
    • Chart Type: Time Series
    • X-Axis Column: Date
    • Y-Axis Column: Close
    • Aggregation Method: Mean
  3. Click Generate Visualization

Step 3: Interpret the Results

  • The chart is a line graph with dates on the X-axis and average closing prices on the Y-axis
  • Each point represents the mean closing price for that date
  • You can see trends, patterns, and changes over time

Step 4: Compare Different Aggregations

Try generating the same chart with different aggregation methods:

  • Mean: Average closing price (smooth trend)
  • Sum: Total closing price (not meaningful for prices, but it illustrates the concept)
  • Median: Middle closing price (robust to outliers)
  • None: All individual closing prices (may be cluttered)
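
The steps above aggregate per date. To reproduce a monthly average outside the dashboard, the same data can be grouped by calendar month with pandas. This sketch assumes the `Date` and `Close` column names used in the tutorial; `monthly_mean_close` is a hypothetical helper.

```python
import pandas as pd


def monthly_mean_close(df, date_col="Date", value_col="Close"):
    """Mean of value_col per calendar month, indexed by monthly Period."""
    dates = pd.to_datetime(df[date_col])
    # to_period("M") maps every timestamp to its month, e.g. 2024-01-20 -> 2024-01
    return df[value_col].groupby(dates.dt.to_period("M")).mean()
```

For example, closes of 100 and 110 in January and 120 in February yield 105 for January and 120 for February.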

Technical Details

Design Patterns

The application uses the Strategy Pattern for:

  • Data Loading: Different strategies for CSV vs Excel files
  • Data Filtering: Different strategies for numerical, categorical, and date filters
  • Visualizations: Different strategies for each chart type
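
For readers unfamiliar with the pattern, here is a minimal sketch of the Strategy Pattern applied to data loading. The class names and the `LOADERS` registry are illustrative assumptions and may not match the classes in data_processor.py.

```python
from abc import ABC, abstractmethod
from pathlib import Path

import pandas as pd


class LoaderStrategy(ABC):
    """Common interface: every loader turns a file path into a DataFrame."""

    @abstractmethod
    def load(self, path: str) -> pd.DataFrame:
        ...


class CsvLoader(LoaderStrategy):
    def load(self, path: str) -> pd.DataFrame:
        return pd.read_csv(path)


class ExcelLoader(LoaderStrategy):
    def load(self, path: str) -> pd.DataFrame:
        return pd.read_excel(path)


# The caller picks a strategy by file extension and never branches on the format itself.
LOADERS = {".csv": CsvLoader(), ".xlsx": ExcelLoader(), ".xls": ExcelLoader()}


def load_with_strategy(path: str) -> pd.DataFrame:
    suffix = Path(path).suffix.lower()
    try:
        strategy = LOADERS[suffix]
    except KeyError:
        raise ValueError(f"No loader registered for {suffix!r}") from None
    return strategy.load(path)
```

Supporting a new format then means adding one class and one registry entry, without touching the calling code. The same structure applies to filters and chart types.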

Code Quality

  • Follows PEP 8 style guidelines
  • Comprehensive docstrings for all functions
  • Proper error handling with try/except blocks
  • Modular design with clear separation of concerns
  • No hardcoded values (uses constants and configuration)

Libraries

  • pandas 2.2.0+: All data manipulation and analysis
  • Gradio 6.0.2: Web interface framework
  • Plotly 5.22.0+: Interactive visualizations
  • matplotlib 3.8.0+ / seaborn 0.13.0+: Additional visualization support
  • Python 3.10+: Minimum interpreter version required by Gradio 6

Sample Datasets

The data/ folder includes sample datasets:

  • sales_train.csv: Sales transaction data
  • stocks.csv: Stock market data
  • Online Retail.xlsx: E-commerce retail data

Requirements

  • Python 3.10 or higher (required by Gradio 6; pandas 2.2 and numpy 1.26 already require 3.9+)
  • All dependencies listed in requirements.txt:
    • pandas >= 2.2.0
    • numpy >= 1.26.0
    • gradio == 6.0.2
    • matplotlib >= 3.8.0
    • seaborn >= 0.13.0
    • plotly >= 5.22.0
    • kaleido >= 0.2.1
    • openpyxl >= 3.1.5
    • Pillow >= 10.4.0
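
To check which of these packages are present in the current environment, the standard library's `importlib.metadata` can be used. The helper `installed_versions` is a hypothetical convenience for this README and is not part of the project.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the requirements list above.
REQUIRED = [
    "pandas", "numpy", "gradio", "matplotlib", "seaborn",
    "plotly", "kaleido", "openpyxl", "Pillow",
]


def installed_versions(packages):
    """Map each package name to its installed version string, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(REQUIRED).items():
        print(f"{name}: {ver or 'not installed'}")
```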

License

This project is created for educational purposes as part of CS5130 coursework.