Commit 543793f by vashu2425 · verified · 1 Parent(s): fc478c4

Delete README.md

Files changed (1)
  1. README.md +0 -144
README.md DELETED
@@ -1,144 +0,0 @@
1
- ---
2
- title: AI-Powered EDA Feature Engineering Assistant
3
- emoji: 📊
4
- colorFrom: blue
5
- colorTo: purple
6
- sdk: streamlit
7
- sdk_version: "1.30.0"
8
- app_file: main.py
9
- pinned: false
10
- ---
11
-
12
-
13
- # AI-Powered EDA & Feature Engineering Assistant
14
-
15
- ![App Banner](https://raw.githubusercontent.com/vashu2425/AI-Powered-EDA-Feature-Engineering-Assistant/main/assets/banner.png)
16
-
17
- An interactive application that uses AI to analyze datasets and provide comprehensive exploratory data analysis (EDA) insights and feature engineering recommendations.
18
-
19
- ## 🌟 Features
20
-
21
- - **🤖 AI-Powered Analysis**: Receive detailed EDA insights generated by Mistral-7B
22
- - **📊 Automated Visualizations**: Generate key visualizations with a single click
23
- - **🔧 Feature Engineering Recommendations**: Get AI suggestions for improving your data
24
- - **⚠️ Data Quality Assessment**: Identify issues in your dataset and get advice on fixing them
25
- - **💬 Chat Interface**: Ask questions about your dataset and get AI-powered answers
26
- - **🌙 Dark Mode UI**: Sleek, modern dark interface for comfortable analysis
27
-
28
- ## 📋 Demo
29
-
30
- Here's a quick look at what you can do:
31
-
32
- 1. Upload a CSV dataset
33
- 2. Get automatic visualizations and statistics
34
- 3. Generate AI-powered insights for:
35
- - Exploratory Data Analysis
36
- - Feature Engineering Recommendations
37
- - Data Quality Assessment
38
- 4. Chat with your data to ask specific questions
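The "automatic statistics" and "data quality" steps listed above can be illustrated with a short sketch. This is not the app's implementation (the README suggests it uses pandas); it is a standard-library approximation, and the function name `summarize_csv` is hypothetical.

```python
import csv
import io
import statistics


def summarize_csv(text):
    """Return per-column missing counts and basic numeric statistics."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    summary = {}
    for column in reader.fieldnames or []:
        values = [row.get(column) for row in rows]
        # A cell counts as missing when it is absent or blank.
        present = [v for v in values if v is not None and v.strip() != ""]
        info = {"missing": len(values) - len(present)}
        try:
            numbers = [float(v) for v in present]
        except ValueError:
            numbers = []  # Treat the column as non-numeric.
        if numbers:
            info["mean"] = statistics.mean(numbers)
            info["min"] = min(numbers)
            info["max"] = max(numbers)
        summary[column] = info
    return summary
```

Calling `summarize_csv(open("data.csv").read())` on an uploaded file would give a dictionary keyed by column name, which could then be passed to the LLM as context for its insights.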
39
-
40
- ## 🛠️ Tech Stack
41
-
42
- - **Frontend**: Streamlit
43
- - **Data Processing & Visualization**: Pandas, NumPy, Matplotlib, Seaborn
44
- - **AI Integration**: LangChain + Groq API
45
- - **LLM**: Llama3-8b-8192
46
-
47
- ## 📦 Installation
48
-
49
- ### Prerequisites
50
- - Python 3.8+
51
- - Anaconda or Miniconda (recommended)
52
- - Groq API key
53
-
54
- ### Setup
55
-
56
- 1. Clone the repository:
57
- ```bash
58
- git clone https://github.com/vashu2425/AI-Powered-EDA-Feature-Engineering-Assistant.git
59
- cd AI-Powered-EDA-Feature-Engineering-Assistant
60
- ```
61
-
62
- 2. Create and activate a conda environment:
63
- ```bash
64
- conda create -n ai_eda_env python=3.10
65
- conda activate ai_eda_env
66
- ```
67
-
68
- 3. Install the required packages:
69
- ```bash
70
- pip install -r requirements.txt
71
- ```
72
-
73
- 4. Create a `.env` file with your Groq API key:
74
- ```
75
- GROQ_API_KEY=your_groq_api_key_here
76
- ```
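How the app reads the key is not shown in this README. As a minimal sketch, assuming the key lives in a plain `KEY=VALUE` file named `.env`, it could be loaded with the standard library alone. The helper names `load_env_file` and `get_groq_api_key` are hypothetical; a real project might use the `python-dotenv` package instead.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines and return them as a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_groq_api_key(path=".env"):
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("GROQ_API_KEY") or load_env_file(path).get("GROQ_API_KEY")
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set; add it to .env or export it.")
    return key
```

Failing early with a clear message is a deliberate choice here: a missing key otherwise tends to surface later as an opaque authentication error from the API client.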
77
-
78
- ### ⚠️ Compatibility Note
79
-
80
- This application requires NumPy 1.24.3 and pandas 1.5.3 to avoid binary compatibility issues. The requirements.txt file pins these exact versions.
81
-
82
- ### 🔧 Troubleshooting
83
-
84
- If you encounter the following error:
85
- ```
86
- ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
87
- ```
88
-
89
- Try the following solutions:
90
-
91
- 1. Make sure you're using the exact versions specified in requirements.txt:
92
- ```bash
93
- pip install numpy==1.24.3 pandas==1.5.3
94
- ```
95
-
96
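- 2. If you see an error mentioning `st.rerun()` or `st.experimental_rerun()`, check your Streamlit version: `st.rerun()` was added in Streamlit 1.27.0 and replaces the older `st.experimental_rerun()`. On older versions, either upgrade Streamlit or use `st.experimental_rerun()` in the code.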
97
-
98
- 3. If you're still having issues, try creating a fresh conda environment with Python 3.10:
99
- ```bash
100
- conda create -n fresh_ai_eda_env python=3.10
101
- conda activate fresh_ai_eda_env
102
- pip install -r requirements.txt
103
- ```
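Before rebuilding an environment, it can help to confirm which versions are actually installed. The sketch below compares installed packages against the pins quoted in this README using `importlib.metadata` (available in Python 3.8+). The function name `check_pins` and its `get_version` parameter are illustrative assumptions, not part of the project.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins quoted in this README's compatibility note.
PINNED = {"numpy": "1.24.3", "pandas": "1.5.3"}


def check_pins(pins=PINNED, get_version=version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # Package is not installed at all.
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches
```

An empty result from `check_pins()` means both packages match the pinned versions; any entry shows the expected version next to what was found.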
104
-
105
- ## 🚀 Usage
106
-
107
- 1. Activate the conda environment:
108
- ```bash
109
- conda activate ai_eda_env
110
- ```
111
-
112
- 2. Run the application:
113
- ```bash
114
- streamlit run main.py
115
- ```
116
-
117
- 3. Open your web browser and navigate to `http://localhost:8501`
118
-
119
- 4. Upload a CSV dataset and start exploring!
120
-
121
- ## 📊 Example Analysis
122
-
123
- Here are some examples of insights you can get:
124
-
125
- - Comprehensive EDA insights about your dataset variables and distributions
126
- - Feature engineering ideas specific to your data
127
- - Data quality improvement recommendations
128
- - Visualizations including correlation heatmaps, distribution plots, and more
129
-
130
- ## 🤝 Contributing
131
-
132
- Contributions are welcome! Please feel free to submit a Pull Request.
133
-
134
- ## 📝 License
135
-
136
- This project is licensed under the MIT License - see the LICENSE file for details.
137
-
138
- ## 📬 Contact
139
-
140
- For any questions or feedback, please reach out to the repository owner.
141
-
142
- ---
143
-
144
- ### 🌟 Star this repository if you find it useful!