Mpavan45 committed on
Commit 90478e8 · verified · 1 parent: 638a4e0

Create EDA and Feature Engineering.py

pages/EDA and Feature Engineering.py ADDED
@@ -0,0 +1,40 @@
+ import streamlit as st
+ import pandas as pd
+
+ # EDA and Feature Engineering Page
+ st.title("EDA and Feature Engineering")
+ st.markdown("""
+ This section is dedicated to exploratory data analysis (EDA) and feature engineering.
+ You can upload your dataset and analyze it here.
+ """)
+
+ # File uploader for dataset
+ uploaded_file = st.file_uploader("Upload your dataset (CSV format):", type=["csv"])
+
+ if uploaded_file is not None:
+     # Read and display the dataset
+     data = pd.read_csv(uploaded_file)
+     st.write("### Uploaded Dataset:")
+     st.dataframe(data)
+
+     # Overview of the dataset (summary statistics for numeric columns)
+     st.write("### Dataset Overview:")
+     st.write(data.describe())
+
+     # Missing values per column
+     st.write("### Missing Values:")
+     st.write(data.isnull().sum())
+
+     # Correlation matrix (numeric columns only; text columns raise in pandas 2.x)
+     st.write("### Correlation Matrix:")
+     st.write(data.corr(numeric_only=True))
+
+     st.markdown("""
+     Based on the insights from this analysis, you can proceed to perform feature engineering by:
+     - Handling missing values.
+     - Creating or transforming features.
+     - Encoding categorical variables.
+     - Normalizing or scaling numerical features.
+     """)
+ else:
+     st.warning("Please upload a dataset to proceed with EDA.")
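
The page lists feature-engineering steps but does not implement them. The following is a minimal sketch of how three of those steps (handling missing values, scaling numerical features, encoding categorical variables) might look with plain pandas. The helper name `basic_feature_engineering` and the specific choices (median/mode imputation, min-max scaling, one-hot encoding) are assumptions for illustration and are not part of this commit. Creating or transforming features is omitted because it depends on the dataset.

```python
import pandas as pd


def basic_feature_engineering(data: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper (not in the commit): impute, scale, and encode."""
    df = data.copy()
    numeric_cols = df.select_dtypes(include="number").columns
    # Assumption: every non-numeric column is treated as categorical.
    categorical_cols = df.select_dtypes(exclude="number").columns

    # Handle missing values: median for numeric columns, most frequent
    # value (mode) for categorical columns.
    for col in numeric_cols:
        df[col] = df[col].fillna(df[col].median())
    for col in categorical_cols:
        mode = df[col].mode(dropna=True)
        if not mode.empty:
            df[col] = df[col].fillna(mode.iloc[0])

    # Scale numeric columns to the range [0, 1] (min-max scaling).
    # A constant column has zero range, so it is set to 0.0 instead.
    for col in numeric_cols:
        col_min, col_max = df[col].min(), df[col].max()
        span = col_max - col_min
        df[col] = (df[col] - col_min) / span if span != 0 else 0.0

    # One-hot encode categorical columns; done after scaling so the
    # resulting 0/1 indicator columns are left untouched.
    if len(categorical_cols) > 0:
        df = pd.get_dummies(df, columns=list(categorical_cols), dtype=int)
    return df


# Small illustrative frame with one numeric and one categorical column.
raw = pd.DataFrame({"age": [20.0, None, 40.0], "city": ["A", None, "A"]})
engineered = basic_feature_engineering(raw)
print(engineered)
```

Inside the page, such a helper could be called in the `if uploaded_file is not None:` branch, for example `st.dataframe(basic_feature_engineering(data))`. Median and mode imputation are only one option; dropping rows or using a model-based imputer may suit some datasets better.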