	---
	title: Business Intelligence Dashboard
	emoji: 📊
	colorFrom: blue
	colorTo: indigo
	sdk: gradio
	sdk_version: 6.0.2
	app_file: app.py
	pinned: false
	license: mit
	---

	# Business Intelligence Dashboard

	A professional, interactive Business Intelligence dashboard built with Gradio that enables non-technical stakeholders to explore and analyze business data through an intuitive web interface.

	## 🌟 Features

	### Data Upload & Validation
	- Support for CSV and Excel files (.xlsx, .xls)
	- Automatic data type detection
	- Data preview and basic information display
	- Comprehensive error handling
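
The upload step above could be sketched roughly as follows. This is a minimal illustration, not the code in `data_processor.py`: the function name `load_table` and the error messages are assumptions made for this example. It relies on pandas to infer column types while reading.

```python
from pathlib import Path

import pandas as pd

SUPPORTED_EXTENSIONS = {".csv", ".xlsx", ".xls"}


def load_table(path: str) -> pd.DataFrame:
    """Load a CSV or Excel file into a DataFrame (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file type '{suffix}'. Use CSV or Excel.")
    # read_csv infers column dtypes; read_excel needs an engine such as
    # openpyxl for .xlsx files (and xlrd for legacy .xls files).
    df = pd.read_csv(path) if suffix == ".csv" else pd.read_excel(path)
    if df.empty:
        raise ValueError("The file contains no data rows.")
    return df
```

Raising `ValueError` with a readable message lets the interface layer catch it and show the text to the user instead of a traceback.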

	### Data Exploration & Summary Statistics
	- Automated data profiling for numerical and categorical columns
	- Missing value analysis
	- Correlation matrix for numerical features
	- Descriptive statistics (mean, median, std, quartiles)
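
As a rough sketch of what such profiling can look like with pandas, the snippet below collects descriptive statistics, missing-value counts, and a correlation matrix. The function name `profile` and the tiny example table are invented for illustration and are not taken from the project.

```python
import pandas as pd


def profile(df: pd.DataFrame) -> dict:
    """Return numeric stats, missing-value info, and correlations (hypothetical helper)."""
    numeric = df.select_dtypes(include="number")
    return {
        # describe() gives count, mean, std, min, quartiles (50% is the median), max
        "numeric_summary": numeric.describe(),
        "missing_counts": df.isna().sum(),
        "missing_percent": (df.isna().mean() * 100).round(2),
        # Pearson correlation between numeric columns, ignoring missing pairs
        "correlation": numeric.corr(),
    }


# Illustrative data only
example = pd.DataFrame(
    {
        "sales": [100.0, 200.0, None, 400.0],
        "units": [1, 2, 3, 4],
        "region": ["N", "S", "S", None],
    }
)
report = profile(example)
print(report["missing_counts"])
```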

	### Interactive Filtering
	- Numerical filters: Range sliders with min/max inputs
	- Categorical filters: Multi-select dropdowns
	- Date filters: Date range selection
	- Real-time row count updates
	- Filtered data preview
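
One way to combine the three filter types is to build a single boolean mask, as in the sketch below. The function `apply_filters` and its argument layout are assumptions for this example; the column names come from the sample sales dataset, but the rows are made up.

```python
import pandas as pd


def apply_filters(df, numeric_ranges=None, categories=None, date_ranges=None):
    """Return rows matching every filter; each argument maps a column to its filter."""
    mask = pd.Series(True, index=df.index)
    for col, (low, high) in (numeric_ranges or {}).items():
        mask &= df[col].between(low, high)  # inclusive on both ends
    for col, allowed in (categories or {}).items():
        mask &= df[col].isin(allowed)
    for col, (start, end) in (date_ranges or {}).items():
        dates = pd.to_datetime(df[col], errors="coerce")  # unparseable -> NaT
        mask &= dates.between(pd.Timestamp(start), pd.Timestamp(end))
    return df[mask]


# Illustrative rows only
sales = pd.DataFrame(
    {
        "Date": ["2024-01-05", "2024-02-10", "2024-03-15"],
        "Region": ["North", "South", "North"],
        "Sales": [120, 80, 200],
    }
)
filtered = apply_filters(
    sales,
    numeric_ranges={"Sales": (100, 250)},
    categories={"Region": ["North"]},
    date_ranges={"Date": ("2024-01-01", "2024-02-28")},
)
print(f"{len(filtered)} of {len(sales)} rows match")
```

Because the mask is recomputed from the original table each time, the row count shown to the user always reflects the current filter settings.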

	### Visualizations (7 Types)
	1. Time Series Plot: Trends over time with multiple aggregation methods
	2. Distribution (Histogram): Data distribution with mean/median lines
	3. Distribution (Box Plot): Statistical distribution with quartiles
	4. Bar Chart: Category analysis with top N selection
	5. Pie Chart: Proportional category breakdown
	6. Scatter Plot: Relationship between variables with trend lines
	7. Correlation Heatmap: Numerical feature correlations

	All visualizations support:
	- User-selectable columns
	- Multiple aggregation methods (sum, mean, count, median)
	- Professional styling with clear labels and legends
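
The aggregation and top-N selection behind a bar chart can be sketched with a pandas `groupby`. The helper `aggregate_top_n` below is hypothetical and only prepares the data; it does not reproduce the charting code in `visualizations.py`.

```python
import pandas as pd

AGGREGATIONS = {"sum", "mean", "count", "median"}


def aggregate_top_n(df, category_col, value_col, method="sum", top_n=10):
    """Aggregate value_col per category and keep the top_n largest groups."""
    if method not in AGGREGATIONS:
        raise ValueError(f"Unknown aggregation '{method}'")
    grouped = df.groupby(category_col)[value_col].agg(method)
    return grouped.sort_values(ascending=False).head(top_n)


# Illustrative rows only
orders = pd.DataFrame(
    {"Category": ["A", "A", "B", "C"], "Sales": [10, 20, 5, 40]}
)
top_two = aggregate_top_n(orders, "Category", "Sales", method="sum", top_n=2)
print(top_two)
```

The resulting Series (categories as the index, aggregated values as the data) can then be handed to a plotting library for the bar or pie chart.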

	### Automated Insights
	- Top/Bottom Performers: Identify highest/lowest values
	- Trends & Patterns: Detect increasing/decreasing trends
	- Anomaly Detection: Find outliers using Z-scores
	- Data Quality Assessment: Completeness, uniqueness, consistency checks
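
To make the first two insight types concrete, here is a small standard-library sketch. Both helper names are invented for this example, and using a least-squares slope to classify a trend is an assumption; the README does not say which method `insights.py` uses.

```python
def top_bottom(totals, n=3):
    """Return (top n, bottom n) as lists of (name, value), best and worst first."""
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n], ranked[-n:][::-1]


def trend_direction(values, tolerance=1e-9):
    """Classify a sequence by the sign of its least-squares slope over position."""
    n = len(values)
    if n < 2:
        return "flat"
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    numerator = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    slope = numerator / denominator
    if slope > tolerance:
        return "increasing"
    if slope < -tolerance:
        return "decreasing"
    return "flat"


# Illustrative numbers only
best, worst = top_bottom({"A": 5, "B": 9, "C": 1, "D": 7}, n=2)
print(best, worst, trend_direction([1, 2, 3, 5]))
```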

	### Export Functionality
	- Export filtered data as CSV
	- Export visualizations as high-resolution PNG images

	## 📋 Requirements

	```
	gradio>=6.0.2
	pandas>=2.0.0
	matplotlib>=3.7.0
	seaborn>=0.12.0
	plotly>=5.14.0
	openpyxl>=3.1.0
	numpy>=1.24.0
	scipy>=1.10.0
	```

	## 🚀 Installation

	1. Clone or download this repository

	2. Install dependencies:
	```bash
	pip install -r requirements.txt
	```

	## 💻 Usage

	### Running the Application

	```bash
	python app.py
	```

	The dashboard launches at `http://127.0.0.1:7860`, Gradio's default local address.

	### Using the Dashboard

	1. Upload Data:
	   - Navigate to the "Data Upload" tab
	   - Click "Upload File" and select a CSV or Excel file
	   - Click "Load Data" to process the file

	2. View Statistics:
	   - Go to the "Statistics" tab
	   - Click "Generate Statistics" to see the data analysis

	3. Filter Data:
	   - Use the "Filter & Explore" tab
	   - Select columns and apply filters
	   - View the filtered data preview and row counts

	4. Create Visualizations:
	   - Navigate to the "Visualizations" tab
	   - Select a visualization type
	   - Choose columns and aggregation methods
	   - Click "Create Visualization"

	5. Generate Insights:
	   - Go to the "Insights" tab
	   - Click "Generate Insights" for automated analysis

	6. Export Results:
	   - Use the "Export" tab to download filtered data
	   - Export visualizations from the "Visualizations" tab

	## 📁 Project Structure

	```
	APDP_DASHBOARD/
	│
	├── app.py                 # Main Gradio application
	├── data_processor.py      # Data loading, cleaning, filtering
	├── visualizations.py      # Chart creation functions
	├── insights.py            # Automated insight generation
	├── utils.py               # Helper functions
	├── requirements.txt       # Dependencies
	├── README.md              # This file
	└── data/                  # Sample datasets
	    ├── sample_sales_data.csv
	    └── sample_employee_data.csv
	```

	## 📊 Sample Datasets

	### 1. Sales Data (`sample_sales_data.csv`)
	- Columns: Date, Product, Category, Region, Sales, Units, Customer_Satisfaction
	- Rows: 75 records
	- Use Case: Sales analysis, time series trends, regional performance

	### 2. Employee Data (`sample_employee_data.csv`)
	- Columns: Employee_ID, Name, Department, Position, Salary, Years_Experience, Performance_Score, Projects_Completed, Training_Hours
	- Rows: 50 records
	- Use Case: HR analytics, salary analysis, performance evaluation

	## 🔧 Code Architecture

	![System Architecture](assets/architecture.svg)

	### Modular Design

	- `utils.py`: Utility functions for data formatting, validation, and type detection
	- `data_processor.py`: Core data operations (loading, cleaning, filtering, aggregation)
	- `visualizations.py`: Interactive Plotly-based chart creation with hover details
	- `insights.py`: Advanced AI-driven insights, Smart Dashboard logic, and Comparison tools
	- `app.py`: Gradio interface orchestrating all components

	### Key Design Principles

	1. Separation of Concerns: Each module has a specific responsibility
	2. Error Handling: Comprehensive try-except blocks with user-friendly messages
	3. Type Hints: All functions include type annotations
	4. Documentation: Detailed docstrings for all functions
	5. PEP 8 Compliance: Follows Python style guidelines

	## 🎯 Features Checklist

	- ✅ Data Upload & Validation (CSV/Excel support)
	- ✅ Data Preview (first/last N rows)
	- ✅ Summary Statistics (numerical & categorical)
	- ✅ Missing Value Analysis
	- ✅ Correlation Matrix
	- ✅ Interactive Filtering (numerical, categorical, datetime)
	- ✅ 7 Visualization Types
	- ✅ Aggregation Methods (sum, mean, count, median)
	- ✅ Automated Insights Generation
	- ✅ Top/Bottom Performers
	- ✅ Trend Detection
	- ✅ Anomaly Detection
	- ✅ Export Filtered Data (CSV)
	- ✅ Export Visualizations (PNG)

	## 🐛 Error Handling

	The application handles:
	- Invalid file formats
	- Missing or corrupted data
	- Empty datasets
	- Invalid column selections
	- Type mismatches
	- Missing values in calculations

	All errors display user-friendly messages in the interface.
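
A validation step that turns these cases into readable messages might look like the following sketch. The function `validation_messages`, its parameters, and the message wording are all hypothetical and only illustrate the pattern of collecting problems instead of raising on the first one.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".csv", ".xlsx", ".xls"}


def validation_messages(filename, row_count, available_columns, selected_columns):
    """Return a list of user-facing problems; an empty list means the input is fine."""
    messages = []
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        messages.append(
            f"'{suffix or filename}' is not a supported format. "
            "Please upload a CSV or Excel file."
        )
    if row_count == 0:
        messages.append("The file has no data rows.")
    missing = [c for c in selected_columns if c not in available_columns]
    if missing:
        messages.append("Unknown column(s): " + ", ".join(missing))
    return messages


print(validation_messages("notes.txt", 0, ["Date", "Sales"], ["Profit"]))
```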

	## 🔍 Technical Highlights

	### Data Processing
	- Automatic datetime detection and conversion
	- Flexible filtering system supporting multiple data types
	- Memory-efficient operations for large datasets
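
Datetime detection can be approximated by trying to parse each non-numeric column and checking how many values succeed. The sketch below assumes the `YYYY-MM-DD` format and a 90% success threshold; both choices, and the function name, are illustrative assumptions rather than the project's actual logic.

```python
import pandas as pd


def detect_datetime_columns(df: pd.DataFrame, threshold: float = 0.9) -> list:
    """Return columns where at least `threshold` of non-null values parse as dates."""
    detected = []
    for col in df.columns:
        series = df[col]
        # Skip columns that are already numeric (including bool) or datetime
        if pd.api.types.is_numeric_dtype(series):
            continue
        if pd.api.types.is_datetime64_any_dtype(series):
            continue
        values = series.dropna()
        if values.empty:
            continue
        parsed = pd.to_datetime(values, errors="coerce", format="%Y-%m-%d")
        if parsed.notna().mean() >= threshold:
            detected.append(col)
    return detected


# Illustrative rows only
table = pd.DataFrame(
    {
        "Date": ["2024-01-05", "2024-02-10", None],
        "Region": ["North", "South", "North"],
        "Sales": [120, 80, 200],
    }
)
print(detect_datetime_columns(table))
```

Detected columns can then be converted in place with the same `pd.to_datetime` call so that date range filters work on real timestamps.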

	### Visualizations
	- Interactive Plotly charts with hover tooltips
	- Automatic color schemes and responsive layout
	- Statistical annotations (mean, median, trend lines)
	- Support for grouped and aggregated data

	### Insights Engine
	- Z-score based anomaly detection
	- Time series trend analysis
	- Categorical pattern recognition
	- Data quality scoring
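
Z-score based anomaly detection flags values that lie far from the mean when measured in standard deviations. The sketch below uses only the standard library; the threshold of 3 and the helper name are assumptions, since the README does not state the cutoff the app uses.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=3.0):
    """Return (index, value) pairs whose absolute Z-score exceeds threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing can be an outlier
    return [
        (i, v) for i, v in enumerate(values) if abs(v - mu) / sigma > threshold
    ]


# Illustrative numbers only: nineteen typical values and one extreme value
readings = [50] * 19 + [500]
print(zscore_outliers(readings))
```

With the population standard deviation, a single value among n can reach a Z-score of at most the square root of n - 1, so a threshold of 3 never fires on ten or fewer values. A lower threshold may be needed for small samples.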

	## 📝 Best Practices

	1. Data Quality: Clean your data before uploading for best results
	2. File Size: For large files (>100 MB), reduce the data before uploading, for example by filtering rows or dropping unused columns
	3. Datetime Columns: Use standard formats (YYYY-MM-DD) for automatic detection
	4. Missing Values: The app handles them gracefully, but review the missing value report

	## 🤝 Contributing

	This project follows PEP 8 style guidelines. When contributing:
	- Add docstrings to all functions
	- Include type hints
	- Handle errors gracefully
	- Write modular, reusable code

	## 📄 License

	This project is released under the MIT License, as declared in the Space metadata. It was created for educational purposes as part of a data science course.

	## 🙋 Support

	For issues or questions:
	1. Check the error messages in the interface
	2. Review the sample datasets for format examples
	3. Ensure all dependencies are installed correctly

	## 🎓 Learning Outcomes

	This project demonstrates:
	- Data manipulation with pandas
	- Interactive web applications with Gradio
	- Data visualization best practices
	- Statistical analysis and insight generation
	- Clean, modular Python code architecture
	- Error handling and user experience design

	---

	Built with ❤️ using Python, Gradio, and pandas