Delete README.md
README.md DELETED
@@ -1,144 +0,0 @@

---
title: AI-Powered EDA Feature Engineering Assistant
emoji: π
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: "1.30.0"
app_file: main.py
pinned: false
---

# AI-Powered EDA & Feature Engineering Assistant

An interactive application that uses AI to analyze datasets and provide comprehensive exploratory data analysis (EDA) insights and feature engineering recommendations.

## Features

- **AI-Powered Analysis**: Receive detailed EDA insights generated by the LLM listed under Tech Stack (Llama3-8b-8192 via the Groq API)
- **Automated Visualizations**: Generate key visualizations with a single click
- **Feature Engineering Recommendations**: Get AI suggestions for improving your data
- **Data Quality Assessment**: Identify issues in your dataset and receive advice on fixing them
- **Chat Interface**: Ask questions about your dataset and get AI-powered answers
- **Dark Mode UI**: Sleek, modern dark interface for comfortable analysis

## Demo

Here's a quick look at what you can do:

1. Upload a CSV dataset
2. Get automatic visualizations and statistics
3. Generate AI-powered insights for:
   - Exploratory Data Analysis
   - Feature Engineering Recommendations
   - Data Quality Assessment
4. Chat with your data to ask specific questions
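
To give a rough idea of the kind of statistics step 2 refers to, here is a minimal sketch that counts missing values and computes basic numeric statistics per column. It is not the application's code: the function name `summarize_csv` and the sample data are hypothetical, and it uses only the standard library, whereas the app itself is built on Pandas.

```python
import csv
import io
import statistics


def summarize_csv(text):
    """Return per-column missing counts and basic numeric statistics."""
    rows = list(csv.DictReader(io.StringIO(text)))
    summary = {}
    for column in (rows[0].keys() if rows else []):
        values = [row[column] for row in rows]
        present = [v for v in values if v not in ("", None)]
        numbers = []
        for v in present:
            try:
                numbers.append(float(v))
            except ValueError:
                pass
        info = {"missing": len(values) - len(present)}
        # Treat the column as numeric only if every non-empty cell parses.
        if present and len(numbers) == len(present):
            info["mean"] = statistics.mean(numbers)
            info["min"] = min(numbers)
            info["max"] = max(numbers)
        summary[column] = info
    return summary


# Hypothetical sample: one missing age and one missing city.
sample = "age,city\n30,Paris\n,Berlin\n40,\n"
print(summarize_csv(sample))
```

A summary like this is also the sort of compact context one might pass to an LLM instead of the raw dataset, though how the app builds its prompts is not described here.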

## Tech Stack

- **Frontend**: Streamlit
- **Data Processing**: Pandas, NumPy, Matplotlib, Seaborn
- **AI Integration**: LangChain + Groq API
- **LLM Model**: Llama3-8b-8192

## Installation

### Prerequisites
- Python 3.8+
- Anaconda or Miniconda (recommended)
- Groq API key

### Setup

1. Clone the repository:
   ```bash
   git clone https://github.com/vashu2425/AI-Powered-EDA-Feature-Engineering-Assistant.git
   cd AI-Powered-EDA-Feature-Engineering-Assistant
   ```

2. Create and activate a conda environment:
   ```bash
   conda create -n ai_eda_env python=3.10
   conda activate ai_eda_env
   ```

3. Install the required packages:
   ```bash
   pip install -r requirements.txt
   ```

4. Create a `.env` file with your Groq API key:
   ```
   GROQ_API_KEY=your_groq_api_key_here
   ```
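
For illustration, the sketch below shows one way a script could read `GROQ_API_KEY` from the `.env` file created in step 4. The helper names `load_env_file` and `get_groq_api_key` are hypothetical, and the parser is a standard-library stand-in; the application itself may load the file differently (for example with a package such as python-dotenv, which is an assumption here).

```python
import os


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env-style file into a dict."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            # Skip blank lines, comments, and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip("\"'")
    return values


def get_groq_api_key(path=".env"):
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("GROQ_API_KEY")
    if not key and os.path.exists(path):
        key = load_env_file(path).get("GROQ_API_KEY")
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set; add it to your .env file")
    return key
```

Failing early with a clear message when the key is missing tends to be easier to debug than an authentication error raised later by the API client.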

### Compatibility Note

This application pins NumPy to 1.24.3 and pandas to 1.5.3 to avoid binary compatibility issues. The `requirements.txt` file lists these exact versions.
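
To confirm that an environment actually matches these versions, a small check along the following lines can help. This is a sketch rather than part of the project: `EXPECTED` and `find_mismatches` are hypothetical names, and the script relies only on the standard-library `importlib.metadata` module (Python 3.8+).

```python
from importlib import metadata

# Versions quoted in the compatibility note above.
EXPECTED = {"numpy": "1.24.3", "pandas": "1.5.3"}


def find_mismatches(expected):
    """Return {package: (wanted, installed)} for every pin that is not met."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {installed}")
```

Running it inside the activated environment prints one line per package whose installed version differs from the pin, and nothing when everything matches.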

### Troubleshooting

If you encounter the following error:
```
ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
```

Try the following solutions:

1. Make sure you're using the exact versions specified in requirements.txt:
   ```bash
   pip install numpy==1.24.3 pandas==1.5.3
   ```

2. If the app fails on `st.experimental_rerun()`, check your Streamlit version: `st.rerun()` replaced it in Streamlit 1.27.0, so on 1.27.0 or newer you might need to update the code to call `st.rerun()` instead.

3. If you're still having issues, try creating a fresh conda environment with Python 3.10:
   ```bash
   conda create -n fresh_ai_eda_env python=3.10
   conda activate fresh_ai_eda_env
   pip install -r requirements.txt
   ```
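
Item 2 above concerns the rename between `st.experimental_rerun()` and `st.rerun()`. One way to avoid hard-coding either name is a small compatibility helper, sketched below. The function `pick_rerun` is hypothetical and not part of Streamlit or this project; stand-in objects replace the real `streamlit` module so the example runs without it installed.

```python
from types import SimpleNamespace


def pick_rerun(st_module):
    """Return st.rerun when available, else the older st.experimental_rerun."""
    rerun = getattr(st_module, "rerun", None)
    if rerun is None:
        rerun = getattr(st_module, "experimental_rerun", None)
    if rerun is None:
        raise AttributeError("this Streamlit build exposes no rerun function")
    return rerun


# In the app this would be: import streamlit as st; rerun = pick_rerun(st)
# Stand-in objects are used here so the sketch runs without Streamlit.
old_api = SimpleNamespace(experimental_rerun=lambda: "experimental_rerun")
new_api = SimpleNamespace(rerun=lambda: "rerun")
print(pick_rerun(old_api)(), pick_rerun(new_api)())
```

Looking the function up once at startup keeps the rest of the code independent of which Streamlit release is installed.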

## Usage

1. Activate the conda environment:
   ```bash
   conda activate ai_eda_env
   ```

2. Run the application:
   ```bash
   streamlit run main.py
   ```

3. Open your web browser and navigate to `http://localhost:8501`

4. Upload a CSV dataset and start exploring!

## Example Analysis

Here are some examples of insights you can get:

- Comprehensive EDA insights about your dataset variables and distributions
- Feature engineering ideas specific to your data
- Data quality improvement recommendations
- Visualizations including correlation heatmaps, distribution plots, and more

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Contact

For any questions or feedback, please reach out to the repository owner.

---

### Star this repository if you find it useful!