AI-Powered Excel Data Analysis App
A Streamlit application that automates Excel data processing, provides intelligent analysis using Google's Gemini AI, and offers interactive visualizations. Perfect for analyzing EOC (Emergency Operations Center) data with automated designation-to-cadre mapping.
Features
File Upload & Processing
- Supports CSV, XLS, XLSX formats
- Automatic data cleaning
- Smart designation to cadre mapping
- Handles multi-level headers
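As a rough illustration of the designation-to-cadre mapping listed above, the sketch below normalizes a raw designation string and looks it up in a table. The mapping entries, the `Unmapped` fallback, and the function names are hypothetical; the app's actual mapping table is not shown in this README.

```python
import re

# Hypothetical mapping for illustration only; not the app's real table.
DESIGNATION_TO_CADRE = {
    "medical officer": "Medical",
    "staff nurse": "Nursing",
    "data entry operator": "Administrative",
}


def normalize(designation: str) -> str:
    """Lowercase, trim, and collapse internal whitespace so spelling variants match."""
    return re.sub(r"\s+", " ", designation.strip().lower())


def map_to_cadre(designation: str, default: str = "Unmapped") -> str:
    """Return the cadre for a designation, or `default` when it is unknown."""
    return DESIGNATION_TO_CADRE.get(normalize(designation), default)


if __name__ == "__main__":
    for raw in ["  Medical   Officer ", "STAFF NURSE", "Driver"]:
        print(f"{raw!r} -> {map_to_cadre(raw)}")
```

Normalizing before the lookup keeps the table small, since casing and spacing variants of the same designation resolve to one key.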
Interactive Data Preview
- Column selection
- Global search functionality
- Advanced column-specific filters
- Customizable row display
- Hide/show index options
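The preview features above could be approximated with a few pandas operations. This is a minimal sketch, assuming the uploaded data is held in a pandas DataFrame; the helper names and the sample column names are hypothetical and may differ from what `app.py` does.

```python
import pandas as pd


def global_search(df: pd.DataFrame, term: str) -> pd.DataFrame:
    """Keep rows where any cell contains `term` (case-insensitive, literal match)."""
    if not term:
        return df
    as_text = df.astype(str)
    mask = as_text.apply(
        lambda col: col.str.contains(term, case=False, regex=False)
    ).any(axis=1)
    return df[mask]


def filter_column(df: pd.DataFrame, column: str, allowed: list) -> pd.DataFrame:
    """Keep rows whose value in `column` is one of `allowed`."""
    return df[df[column].isin(allowed)]


if __name__ == "__main__":
    # Hypothetical sample data for illustration.
    sample = pd.DataFrame(
        {
            "District": ["Lahore", "Karachi", "Multan"],
            "Cadre": ["Medical", "Nursing", "Medical"],
            "Staff": [12, 7, 5],
        }
    )
    print(global_search(sample, "lah"))
    print(filter_column(sample, "Cadre", ["Medical"]))
    # Column selection and a row limit for the preview.
    print(sample[["District", "Staff"]].head(2))
```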
AI-Powered Analysis
- Intelligent data insights using Gemini AI
- Natural language queries
- Automated data summaries
- Pattern recognition
- Follow-up question suggestions
Data Visualization
- Dynamic charts and graphs
- Cadre distribution analysis
- District-wise visualizations
- Interactive dashboards
- Correlation analysis
Setup & Installation
Clone the repository
git clone https://github.com/HussainM899/AI-Data-Processing-Analytics.git
cd AI-Data-Processing-Analytics
Create and activate virtual environment
python -m venv venv
source venv/bin/activate # For Linux/Mac
venv\Scripts\activate # For Windows
Install dependencies
pip install -r requirements.txt
Set up environment variables
- Create a .env file in the root directory
- Add required credentials (see .env.example)
Required Environment Variables
GOOGLE_APPLICATION_CREDENTIALS=path/to/credentials.json
GOOGLE_API_KEY=your_api_key_here
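A startup check along the following lines can catch a missing variable before the app tries to reach Gemini. This is a sketch using only the standard library; the function name is hypothetical, and it assumes the `.env` file has already been loaded into the process environment (for example with python-dotenv, which is listed under Dependencies).

```python
import os

# The two variables named in this README.
REQUIRED_VARS = ("GOOGLE_APPLICATION_CREDENTIALS", "GOOGLE_API_KEY")


def missing_env_vars(env=None) -> list:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print(f"Missing environment variables: {', '.join(missing)}")
    else:
        print("All required environment variables are set.")
```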
Usage
Start the application
streamlit run app.py
Upload Data
- Use the file uploader to import your Excel/CSV file
- The app automatically processes and cleans the data
- Multi-level headers are automatically handled
Analyze Data
- Use the navigation sidebar to switch between modes:
  - Data Processing
  - Analysis & Visualization
  - About
- Ask questions in natural language
- View automated insights and visualizations
Export Results
- Download processed data in Excel format
- Export updated designation mappings
- Save analysis reports
Project Structure
AI-Data-Processing-Analytics/
├── app.py # Main application file
├── requirements.txt # Project dependencies
├── .env.example # Example environment variables
├── .gitignore # Git ignore rules
└── README.md # Project documentation
Dependencies
- streamlit: Web application framework
- pandas: Data manipulation and analysis
- plotly: Interactive visualizations
- google-generativeai: Gemini AI integration
- langchain-google-genai: LangChain integration
- python-dotenv: Environment variable management
- openpyxl: Excel file handling
Security Notes
- Never commit sensitive credentials
- Use environment variables for API keys
- Keep service account JSON file secure
- Regularly rotate credentials
- Avoid sharing API keys publicly
Features in Detail
Data Processing
- Automatic cleaning of data
- Handling of missing values
- Removal of duplicates
- Smart string cleaning
- Multi-level header handling
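The cleaning steps listed above might look roughly like this in pandas. It is a sketch under stated assumptions, not the code in `app.py`: the function names are hypothetical, "handling of missing values" is interpreted here as dropping fully empty rows, and header levels that pandas labels `Unnamed: ...` are treated as blank.

```python
import pandas as pd


def flatten_headers(df: pd.DataFrame) -> pd.DataFrame:
    """Join multi-level column labels into single strings, skipping blank levels."""
    df = df.copy()
    if isinstance(df.columns, pd.MultiIndex):
        df.columns = [
            " ".join(
                str(level).strip()
                for level in col
                if str(level).strip() and not str(level).startswith("Unnamed")
            )
            for col in df.columns
        ]
    return df


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Flatten headers, tidy strings, drop empty rows, and remove duplicates."""
    df = flatten_headers(df)
    for name in df.columns:
        if pd.api.types.is_numeric_dtype(df[name]):
            continue
        # String cleaning: trim and collapse whitespace; leave non-strings alone.
        df[name] = df[name].map(
            lambda v: " ".join(v.split()) if isinstance(v, str) else v
        )
    df = df.dropna(how="all")  # assumption: only fully empty rows are dropped
    df = df.drop_duplicates()
    return df.reset_index(drop=True)
```

A file with two header rows can be read with `pd.read_excel(path, header=[0, 1])`, which yields the `MultiIndex` columns that `flatten_headers` collapses.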
AI Analysis
- District-wise analysis
- Cadre distribution insights
- Trend identification
- Anomaly detection
- Custom query handling
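One plausible way to support custom queries is to send the model a compact summary of the data together with the user's question. The sketch below only builds that prompt text; the function name, the `District` and `Cadre` column names, and the prompt wording are assumptions, and the actual Gemini call is deliberately left out.

```python
import pandas as pd


def build_prompt(df: pd.DataFrame, question: str, top_n: int = 5) -> str:
    """Combine a short data summary and the user's question into one prompt string."""
    columns = ", ".join(map(str, df.columns))
    lines = [f"The dataset has {len(df)} rows and these columns: {columns}."]
    for column in ("District", "Cadre"):  # hypothetical column names
        if column in df.columns:
            counts = df[column].value_counts().head(top_n)
            summary = ", ".join(f"{value} ({count})" for value, count in counts.items())
            lines.append(f"Most common {column} values: {summary}.")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample data for illustration.
    sample = pd.DataFrame(
        {"District": ["Lahore", "Lahore", "Multan"], "Cadre": ["Medical", "Medical", "Nursing"]}
    )
    print(build_prompt(sample, "Which district has the most medical staff?"))
```

Sending aggregated counts rather than raw rows keeps the prompt short and avoids passing individual records to the model.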
Visualization
- Pie charts for distributions
- Bar charts for comparisons
- Histograms for numerical data
- Correlation matrices
- Interactive filters
Contributing
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contact
Hussain - hussainmurtaza899@gmail.com
Project Link: https://github.com/HussainM899/AI-Data-Processing-Analytics
Built using Streamlit and Gemini AI