---
title: Business Intelligence Dashboard
emoji: 📊
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 6.0.2
app_file: app.py
pinned: false
license: mit
---

# Business Intelligence Dashboard

A professional, interactive Business Intelligence dashboard built with Gradio that enables non-technical stakeholders to explore and analyze business data through an intuitive web interface.

## 🌟 Features

### Data Upload & Validation

- Support for CSV and Excel files (.xlsx, .xls)
- Automatic data type detection
- Data preview and basic information display
- Comprehensive error handling

### Data Exploration & Summary Statistics

- Automated data profiling for numerical and categorical columns
- Missing value analysis
- Correlation matrix for numerical features
- Descriptive statistics (mean, median, std, quartiles)

### Interactive Filtering

- **Numerical filters**: Range sliders with min/max inputs
- **Categorical filters**: Multi-select dropdowns
- **Date filters**: Date range selection
- Real-time row count updates
- Filtered data preview

### Visualizations (7 Types)

1. **Time Series Plot**: Trends over time with multiple aggregation methods
2. **Distribution (Histogram)**: Data distribution with mean/median lines
3. **Distribution (Box Plot)**: Statistical distribution with quartiles
4. **Bar Chart**: Category analysis with top N selection
5. **Pie Chart**: Proportional category breakdown
6. **Scatter Plot**: Relationship between variables with trend lines
7. **Correlation Heatmap**: Numerical feature correlations

All visualizations support:

- User-selectable columns
- Multiple aggregation methods (sum, mean, count, median)
- Professional styling with clear labels and legends

### Automated Insights

- **Top/Bottom Performers**: Identify highest/lowest values
- **Trends & Patterns**: Detect increasing/decreasing trends
- **Anomaly Detection**: Find outliers using statistical methods
- **Data Quality Assessment**: Completeness, uniqueness, consistency checks

### Export Functionality

- Export filtered data as CSV
- Export visualizations as high-resolution PNG images

## 📋 Requirements

```
gradio>=6.0.2
pandas>=2.0.0
matplotlib>=3.7.0
seaborn>=0.12.0
plotly>=5.14.0
openpyxl>=3.1.0
numpy>=1.24.0
scipy>=1.10.0
```

## 🚀 Installation

1. **Clone or download this repository**

2. **Install dependencies**:
   ```bash
   pip install -r requirements.txt
   ```

## 💻 Usage

### Running the Application

```bash
python app.py
```

The dashboard will launch at `http://127.0.0.1:7860`.

### Using the Dashboard

1. **Upload Data**:
   - Navigate to the "Data Upload" tab
   - Click "Upload File" and select a CSV or Excel file
   - Click "Load Data" to process the file

2. **View Statistics**:
   - Go to the "Statistics" tab
   - Click "Generate Statistics" to see comprehensive data analysis

3. **Filter Data**:
   - Use the "Filter & Explore" tab
   - Select columns and apply filters
   - View filtered data preview and row counts

4. **Create Visualizations**:
   - Navigate to the "Visualizations" tab
   - Select visualization type
   - Choose columns and aggregation methods
   - Click "Create Visualization"

5. **Generate Insights**:
   - Go to the "Insights" tab
   - Click "Generate Insights" for automated analysis

6. **Export Results**:
   - Use the "Export" tab to download filtered data
   - Export visualizations from the Visualizations tab

## 📁 Project Structure

```
APDP_DASHBOARD/
│
├── app.py                 # Main Gradio application
├── data_processor.py      # Data loading, cleaning, filtering
├── visualizations.py      # Chart creation functions
├── insights.py            # Automated insight generation
├── utils.py               # Helper functions
├── requirements.txt       # Dependencies
├── README.md              # This file
└── data/                  # Sample datasets
    ├── sample_sales_data.csv
    └── sample_employee_data.csv
```

## 📊 Sample Datasets

### 1. Sales Data (`sample_sales_data.csv`)

- **Columns**: Date, Product, Category, Region, Sales, Units, Customer_Satisfaction
- **Rows**: 75 records
- **Use Case**: Sales analysis, time series trends, regional performance

### 2. Employee Data (`sample_employee_data.csv`)

- **Columns**: Employee_ID, Name, Department, Position, Salary, Years_Experience, Performance_Score, Projects_Completed, Training_Hours
- **Rows**: 50 records
- **Use Case**: HR analytics, salary analysis, performance evaluation

## 🔧 Code Architecture

![System Architecture](assets/architecture.svg)

### Modular Design

- **`utils.py`**: Utility functions for data formatting, validation, and type detection
- **`data_processor.py`**: Core data operations (loading, cleaning, filtering, aggregation)
- **`visualizations.py`**: Interactive Plotly-based chart creation with hover details
- **`insights.py`**: Advanced AI-driven insights, Smart Dashboard logic, and Comparison tools
- **`app.py`**: Gradio interface orchestrating all components

### Key Design Principles

1. **Separation of Concerns**: Each module has a specific responsibility
2. **Error Handling**: Comprehensive try-except blocks with user-friendly messages
3. **Type Hints**: All functions include type annotations
4. **Documentation**: Detailed docstrings for all functions
5. **PEP 8 Compliance**: Follows Python style guidelines

## 🎯 Features Checklist

- ✅ Data Upload & Validation (CSV/Excel support)
- ✅ Data Preview (first/last N rows)
- ✅ Summary Statistics (numerical & categorical)
- ✅ Missing Value Analysis
- ✅ Correlation Matrix
- ✅ Interactive Filtering (numerical, categorical, datetime)
- ✅ 7 Visualization Types
- ✅ Aggregation Methods (sum, mean, count, median)
- ✅ Automated Insights Generation
- ✅ Top/Bottom Performers
- ✅ Trend Detection
- ✅ Anomaly Detection
- ✅ Export Filtered Data (CSV)
- ✅ Export Visualizations (PNG)

## 🐛 Error Handling

The application handles:

- Invalid file formats
- Missing or corrupted data
- Empty datasets
- Invalid column selections
- Type mismatches
- Missing values in calculations

All errors display user-friendly messages in the interface.

## 🔍 Technical Highlights

### Data Processing

- Automatic datetime detection and conversion
- Flexible filtering system supporting multiple data types
- Memory-efficient operations for large datasets

### Visualizations

- Interactive Plotly charts with hover tooltips
- Automatic color schemes and responsive layout
- Statistical annotations (mean, median, trend lines)
- Support for grouped and aggregated data

### Insights Engine

- Z-score based anomaly detection
- Time series trend analysis
- Categorical pattern recognition
- Data quality scoring

## 📝 Best Practices

1. **Data Quality**: Clean your data before uploading for best results
2. **File Size**: For large files (>100MB), consider filtering in Excel first
3. **Datetime Columns**: Use standard formats (YYYY-MM-DD) for automatic detection
4. **Missing Values**: The app handles them gracefully, but review the missing value report

## 🤝 Contributing

This project follows PEP 8 style guidelines.
When contributing:

- Add docstrings to all functions
- Include type hints
- Handle errors gracefully
- Write modular, reusable code

## 📄 License

This project is released under the MIT License, as declared in the front matter of this README. It was created for educational purposes as part of a data science course.

## 🙋 Support

For issues or questions:

1. Check the error messages in the interface
2. Review the sample datasets for format examples
3. Ensure all dependencies are installed correctly

## 🎓 Learning Outcomes

This project demonstrates:

- Data manipulation with pandas
- Interactive web applications with Gradio
- Data visualization best practices
- Statistical analysis and insight generation
- Clean, modular Python code architecture
- Error handling and user experience design

---

**Built with ❤️ using Python, Gradio, and pandas**
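
## 🧪 Appendix: Z-Score Anomaly Detection Sketch

The Insights Engine is described as using z-score based anomaly detection. The snippet below is a minimal, dependency-free sketch of that general technique, not the code in `insights.py`. The function name `zscore_outliers`, the `threshold` parameter, and the sample `sales` values are all hypothetical and chosen only for illustration; the actual module may rely on pandas or scipy and may use a different threshold.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def zscore_outliers(values: Sequence[float], threshold: float = 3.0) -> List[int]:
    """Return the indices of values whose absolute z-score exceeds `threshold`.

    Hypothetical helper for illustration; not part of the project's API.
    """
    if len(values) < 2:
        return []  # Not enough data to estimate spread.

    mu = mean(values)
    sigma = pstdev(values)  # Population standard deviation.
    if sigma == 0:
        return []  # All values identical, so nothing can be an outlier.

    return [i for i, v in enumerate(values) if abs((v - mu) / sigma) > threshold]


# Illustrative data: eight typical values and one extreme value.
sales = [100, 102, 98, 101, 99, 103, 97, 100, 500]
outliers = zscore_outliers(sales, threshold=2.0)
print(outliers)  # [8]
```

With the population standard deviation, no value in a sample of size n can have an absolute z-score above sqrt(n - 1), so the default threshold of 3.0 flags nothing in this nine-value example. For small datasets such as the bundled samples, a lower threshold or a different method (for example, an IQR-based rule) may be more appropriate.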