---
title: Business Intelligence Dashboard
emoji: 📊
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 6.0.2
app_file: app.py
pinned: false
license: mit
---
# Business Intelligence Dashboard
A professional, interactive Business Intelligence dashboard built with Gradio that enables non-technical stakeholders to explore and analyze business data through an intuitive web interface.
## 🌟 Features
### Data Upload & Validation
- Support for CSV and Excel files (.xlsx, .xls)
- Automatic data type detection
- Data preview and basic information display
- Comprehensive error handling
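The upload step described above can be sketched as follows. This is a minimal illustration rather than the project's actual loader: the function name `load_table` is hypothetical, and dispatching on the file extension is an assumption about how the formats are told apart.

```python
from pathlib import Path

import pandas as pd


def load_table(path: str) -> pd.DataFrame:
    """Load a CSV or Excel file into a DataFrame, rejecting other formats.

    Hypothetical helper: the name and the extension-based dispatch are
    assumptions made for this sketch.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        df = pd.read_csv(path)
    elif suffix in {".xlsx", ".xls"}:
        # .xlsx is read through openpyxl; legacy .xls files need the
        # separate xlrd package, which is not in the requirements list.
        df = pd.read_excel(path)
    else:
        raise ValueError(f"Unsupported file format: {suffix or 'no extension'}")
    if df.empty:
        raise ValueError("The file contains no rows.")
    return df
```

pandas infers column data types while reading, so `df.dtypes` and `df.head()` are enough to drive a basic preview.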
### Data Exploration & Summary Statistics
- Automated data profiling for numerical and categorical columns
- Missing value analysis
- Correlation matrix for numerical features
- Descriptive statistics (mean, median, std, quartiles)
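As a rough sketch of what this profiling involves, the statistics listed above map onto a handful of pandas calls. The function name `profile` and the shape of the returned dictionary are assumptions for illustration, not the project's API.

```python
import pandas as pd


def profile(df: pd.DataFrame) -> dict:
    """Return summary statistics, missing-value counts, and correlations.

    Hypothetical helper: the name and the returned keys are assumptions.
    """
    numeric = df.select_dtypes(include="number")
    categorical = df.select_dtypes(exclude="number")
    return {
        # describe() covers count, mean, std, min, quartiles (50% = median), max.
        "numeric_summary": numeric.describe() if numeric.shape[1] else pd.DataFrame(),
        # Most frequent values of each non-numeric column.
        "top_categories": {
            col: categorical[col].value_counts().head(5) for col in categorical.columns
        },
        "missing_counts": df.isna().sum(),
        "missing_pct": (df.isna().mean() * 100).round(1),
        # Pearson correlation, computed on pairwise-complete rows.
        "correlation": numeric.corr(),
    }
```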
### Interactive Filtering
- **Numerical filters**: Range sliders with min/max inputs
- **Categorical filters**: Multi-select dropdowns
- **Date filters**: Date range selection
- Real-time row count updates
- Filtered data preview
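The three filter types can be combined into a single boolean mask. The sketch below assumes a hypothetical `apply_filters` function that takes one dictionary per filter type; the real interface in `data_processor.py` may differ.

```python
import pandas as pd


def apply_filters(df, numeric_ranges=None, categories=None, date_ranges=None):
    """Keep only the rows that satisfy every supplied filter.

    Hypothetical helper. Each argument maps a column name to its filter:
    (low, high) for numbers, a list of allowed values for categories,
    (start, end) for dates. All bounds are inclusive.
    """
    mask = pd.Series(True, index=df.index)
    for col, (low, high) in (numeric_ranges or {}).items():
        mask &= df[col].between(low, high)
    for col, allowed in (categories or {}).items():
        mask &= df[col].isin(allowed)
    for col, (start, end) in (date_ranges or {}).items():
        # Unparseable dates become NaT and are excluded by between().
        dates = pd.to_datetime(df[col], errors="coerce")
        mask &= dates.between(pd.Timestamp(start), pd.Timestamp(end))
    return df[mask]
```

The row count shown in the interface would then simply be `len(apply_filters(...))`.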
### Visualizations (7 Types)
1. **Time Series Plot**: Trends over time with multiple aggregation methods
2. **Distribution (Histogram)**: Data distribution with mean/median lines
3. **Distribution (Box Plot)**: Statistical distribution with quartiles
4. **Bar Chart**: Category analysis with top N selection
5. **Pie Chart**: Proportional category breakdown
6. **Scatter Plot**: Relationship between variables with trend lines
7. **Correlation Heatmap**: Numerical feature correlations
All visualizations support:
- User-selectable columns
- Multiple aggregation methods (sum, mean, count, median)
- Professional styling with clear labels and legends
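The aggregation methods and the top N selection used by the bar chart can be illustrated with a pandas `groupby`. The helper name `aggregate_top_n` is hypothetical, and the chart itself is left out to keep the sketch free of plotting dependencies.

```python
import pandas as pd

# The four aggregation methods listed above.
AGGREGATIONS = ("sum", "mean", "count", "median")


def aggregate_top_n(df, category, value, method="sum", top_n=10):
    """Aggregate `value` per `category` and keep the `top_n` largest groups.

    Hypothetical helper: name and signature are assumptions for this sketch.
    """
    if method not in AGGREGATIONS:
        raise ValueError(f"Unknown aggregation method: {method}")
    grouped = df.groupby(category)[value].agg(method)
    return grouped.sort_values(ascending=False).head(top_n)
```

The resulting Series (categories as index, aggregated values as data) can be handed to a charting call such as `plotly.express.bar`.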
### Automated Insights
- **Top/Bottom Performers**: Identify highest/lowest values
- **Trends & Patterns**: Detect increasing/decreasing trends
- **Anomaly Detection**: Find outliers using statistical methods
- **Data Quality Assessment**: Completeness, uniqueness, consistency checks
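Two of these insights are easy to sketch. The README does not say how trends are detected, so the example below assumes a least-squares slope compared against the series mean; the function names and the 1% `tolerance` are likewise assumptions.

```python
import numpy as np
import pandas as pd


def detect_trend(values: pd.Series, tolerance: float = 0.01) -> str:
    """Classify a series as increasing, decreasing, or flat.

    Assumption: the trend is the slope of a fitted straight line, expressed
    as a fraction of the mean level so the tolerance is scale-free.
    """
    y = values.dropna().to_numpy(dtype=float)
    if len(y) < 2:
        return "insufficient data"
    slope = np.polyfit(np.arange(len(y)), y, deg=1)[0]
    scale = abs(y.mean()) or 1.0  # avoid dividing by zero
    relative = slope / scale
    if relative > tolerance:
        return "increasing"
    if relative < -tolerance:
        return "decreasing"
    return "flat"


def top_bottom(df: pd.DataFrame, column: str, n: int = 3):
    """Return the n highest and n lowest rows by `column`."""
    return df.nlargest(n, column), df.nsmallest(n, column)
```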
### Export Functionality
- Export filtered data as CSV
- Export visualizations as high-resolution PNG images
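For the CSV part of the export, a minimal sketch looks like this. The function name `export_filtered_csv` is hypothetical; PNG export is not covered here.

```python
import pandas as pd


def export_filtered_csv(df: pd.DataFrame, path: str) -> str:
    """Write the filtered rows to a CSV file and return its path.

    Hypothetical helper. index=False keeps the pandas row index out of
    the file so it reopens cleanly in a spreadsheet.
    """
    df.to_csv(path, index=False)
    return path
```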
## 📋 Requirements
```
gradio>=6.0.2
pandas>=2.0.0
matplotlib>=3.7.0
seaborn>=0.12.0
plotly>=5.14.0
openpyxl>=3.1.0
numpy>=1.24.0
scipy>=1.10.0
```
## 🚀 Installation
1. **Clone or download this repository**
2. **Install dependencies**:
```bash
pip install -r requirements.txt
```
## 💻 Usage
### Running the Application
```bash
python app.py
```
By default, the dashboard launches at `http://127.0.0.1:7860`.
### Using the Dashboard
1. **Upload Data**:
   - Navigate to the "Data Upload" tab
   - Click "Upload File" and select a CSV or Excel file
   - Click "Load Data" to process the file
2. **View Statistics**:
   - Go to the "Statistics" tab
   - Click "Generate Statistics" to see the summary statistics, missing value analysis, and correlations
3. **Filter Data**:
   - Go to the "Filter & Explore" tab
   - Select columns and apply filters
   - View the filtered data preview and row counts
4. **Create Visualizations**:
   - Navigate to the "Visualizations" tab
   - Select a visualization type
   - Choose columns and aggregation methods
   - Click "Create Visualization"
5. **Generate Insights**:
   - Go to the "Insights" tab
   - Click "Generate Insights" for automated analysis
6. **Export Results**:
   - Use the "Export" tab to download filtered data
   - Export visualizations from the "Visualizations" tab
## 📁 Project Structure
```
APDP_DASHBOARD/
│
├── app.py                  # Main Gradio application
├── data_processor.py       # Data loading, cleaning, filtering
├── visualizations.py       # Chart creation functions
├── insights.py             # Automated insight generation
├── utils.py                # Helper functions
├── requirements.txt        # Dependencies
├── README.md               # This file
└── data/                   # Sample datasets
    ├── sample_sales_data.csv
    └── sample_employee_data.csv
```
## 📊 Sample Datasets
### 1. Sales Data (`sample_sales_data.csv`)
- **Columns**: Date, Product, Category, Region, Sales, Units, Customer_Satisfaction
- **Rows**: 75 records
- **Use Case**: Sales analysis, time series trends, regional performance
### 2. Employee Data (`sample_employee_data.csv`)
- **Columns**: Employee_ID, Name, Department, Position, Salary, Years_Experience, Performance_Score, Projects_Completed, Training_Hours
- **Rows**: 50 records
- **Use Case**: HR analytics, salary analysis, performance evaluation
## 🔧 Code Architecture
![System Architecture](assets/architecture.svg)
### Modular Design
- **`utils.py`**: Utility functions for data formatting, validation, and type detection
- **`data_processor.py`**: Core data operations (loading, cleaning, filtering, aggregation)
- **`visualizations.py`**: Interactive Plotly-based chart creation with hover details
- **`insights.py`**: Advanced AI-driven insights, Smart Dashboard logic, and Comparison tools
- **`app.py`**: Gradio interface orchestrating all components
### Key Design Principles
1. **Separation of Concerns**: Each module has a specific responsibility
2. **Error Handling**: Comprehensive try-except blocks with user-friendly messages
3. **Type Hints**: All functions include type annotations
4. **Documentation**: Detailed docstrings for all functions
5. **PEP 8 Compliance**: Follows Python style guidelines
## 🎯 Features Checklist
- ✅ Data Upload & Validation (CSV/Excel support)
- ✅ Data Preview (first/last N rows)
- ✅ Summary Statistics (numerical & categorical)
- ✅ Missing Value Analysis
- ✅ Correlation Matrix
- ✅ Interactive Filtering (numerical, categorical, datetime)
- ✅ 7 Visualization Types
- ✅ Aggregation Methods (sum, mean, count, median)
- ✅ Automated Insights Generation
- ✅ Top/Bottom Performers
- ✅ Trend Detection
- ✅ Anomaly Detection
- ✅ Export Filtered Data (CSV)
- ✅ Export Visualizations (PNG)
## 🐛 Error Handling
The application handles:
- Invalid file formats
- Missing or corrupted data
- Empty datasets
- Invalid column selections
- Type mismatches
- Missing values in calculations
All errors display user-friendly messages in the interface.
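One way to turn exceptions into interface messages is a small wrapper around each operation. The sketch below is an assumption about the pattern, not the project's code: `safe_call` and the message wording are hypothetical.

```python
from typing import Any, Callable, Optional, Tuple


def safe_call(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Optional[Any], str]:
    """Run `func` and return (result, message) instead of raising.

    Hypothetical helper: the message texts are illustrative only.
    """
    try:
        return func(*args, **kwargs), "Success"
    except FileNotFoundError:
        return None, "File not found. Please upload the file again."
    except KeyError as exc:
        # Typically an invalid column selection.
        return None, f"Column not found: {exc.args[0]}"
    except (ValueError, TypeError) as exc:
        # Covers bad formats, empty data, and type mismatches.
        return None, f"Could not process the data: {exc}"
    except Exception as exc:  # last resort, so the interface never crashes
        return None, f"Unexpected error: {exc}"
```

A Gradio callback can return the message string to a status component while leaving the data component unchanged when the result is `None`.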
## 🔍 Technical Highlights
### Data Processing
- Automatic datetime detection and conversion
- Flexible filtering system supporting multiple data types
- Memory-efficient operations for large datasets
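Automatic datetime detection can be approximated by trying to parse each non-numeric column and converting it only when most values parse. In this sketch the function name, the fixed `YYYY-MM-DD` format, and the 80% `threshold` are all assumptions.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype


def detect_datetime_columns(df: pd.DataFrame, threshold: float = 0.8) -> pd.DataFrame:
    """Return a copy of `df` with date-like text columns converted to datetime.

    Hypothetical helper. A column is converted when at least `threshold`
    of its non-missing values parse as YYYY-MM-DD.
    """
    converted = df.copy()
    for col in converted.columns:
        series = converted[col]
        if is_numeric_dtype(series) or is_datetime64_any_dtype(series):
            continue
        # Values that do not match the format become NaT.
        parsed = pd.to_datetime(series, format="%Y-%m-%d", errors="coerce")
        non_null = series.notna().sum()
        if non_null and parsed.notna().sum() / non_null >= threshold:
            converted[col] = parsed
    return converted
```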
### Visualizations
- Interactive Plotly charts with hover tooltips
- Automatic color schemes and responsive layout
- Statistical annotations (mean, median, trend lines)
- Support for grouped and aggregated data
### Insights Engine
- Z-score based anomaly detection
- Time series trend analysis
- Categorical pattern recognition
- Data quality scoring
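Z-score anomaly detection flags values that lie far from the mean in units of standard deviation. The sketch below is one plausible reading: the function name and the cutoff of 3 are assumptions, since the README does not state the threshold used.

```python
import pandas as pd


def zscore_outliers(series: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Return the values whose absolute z-score exceeds `threshold`.

    Hypothetical helper; uses the population standard deviation (ddof=0).
    """
    values = series.dropna().astype(float)
    if len(values) < 2:
        return values.iloc[0:0]
    std = values.std(ddof=0)
    if std == 0:
        # All values are identical, so nothing can be an outlier.
        return values.iloc[0:0]
    z = (values - values.mean()) / std
    return values[z.abs() > threshold]
```

With the population standard deviation, the largest possible z-score in a sample of n values is sqrt(n - 1), so a cutoff of 3 can only fire on columns with more than 10 non-missing values.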
## 📝 Best Practices
1. **Data Quality**: Clean your data before uploading for best results
2. **File Size**: For large files (>100MB), consider filtering in Excel first
3. **Datetime Columns**: Use standard formats (YYYY-MM-DD) for automatic detection
4. **Missing Values**: The app handles them gracefully, but review the missing value report
## 🤝 Contributing
This project follows PEP 8 style guidelines. When contributing:
- Add docstrings to all functions
- Include type hints
- Handle errors gracefully
- Write modular, reusable code
## 📄 License
This project is released under the MIT license. It was created for educational purposes as part of a data science course.
## 🙋 Support
For issues or questions:
1. Check the error messages in the interface
2. Review the sample datasets for format examples
3. Ensure all dependencies are installed correctly
## 🎓 Learning Outcomes
This project demonstrates:
- Data manipulation with pandas
- Interactive web applications with Gradio
- Data visualization best practices
- Statistical analysis and insight generation
- Clean, modular Python code architecture
- Error handling and user experience design
---
**Built with ❤️ using Python, Gradio, and pandas**