Update pages/1_Introduction to Data_Analysis.py
pages/1_Introduction to Data_Analysis.py
CHANGED
|
@@ -40,4 +40,38 @@ Examples of semi-structured data is - csv files, json files and xml files
|
|
| 40 |
'''
|
| 41 |
st.image("https://cdn-uploads.huggingface.co/production/uploads/66bde9bf3c885d04498227a0/Gz_AZKg8M7e9K96TsVenU.png")
|
| 42 |
st.markdown(multi)
|
| 43 |
-
| 40 |
'''
|
| 41 |
st.image("https://cdn-uploads.huggingface.co/production/uploads/66bde9bf3c885d04498227a0/Gz_AZKg8M7e9K96TsVenU.png")
|
| 42 |
st.markdown(multi)
|
| 43 |
+
st.header("**Steps in Data Analysis**")
|
| 44 |
+
st.markdown('''A complete data analysis involves 7 steps.''')
|
| 45 |
+
st.subheader(":blue[1.Problem Statement]")
|
| 46 |
+
multi = '''Once data is available and an analysis is required, the first step is the problem statement: a concise description of the problem that needs to be solved.
|
| 47 |
+
It serves as the blueprint for the data analysis because it clearly identifies the specific issue that needs to be addressed.
|
| 48 |
+
'''
|
| 49 |
+
st.markdown(multi)
|
| 50 |
+
st.subheader(":blue[2.Data Collection]")
|
| 51 |
+
multi = '''After identifying the issue that needs to be addressed, we need to collect the data related to that issue.
|
| 52 |
+
Data needs to be collected in different formats from many sources, such as websites, so that we can perform the analysis more easily.
|
| 53 |
+
We can gather new data or reuse previously collected data with the help of stakeholders and domain experts.
|
| 54 |
+
'''
|
| 55 |
+
st.markdown(multi)
|
| 56 |
+
st.subheader(":blue[3.Simple EDA (Exploratory Data Analysis)]")
|
| 57 |
+
multi = '''After collecting the data, we need to check whether it contains any impurities.
|
| 58 |
+
For that we use a simple EDA, which tells us whether the collected data contains any impurities.
|
| 59 |
+
If the collected data has no impurities, it goes directly to the EDA phase; otherwise it goes to the pre-processing phase.
|
| 60 |
+
'''
|
| 61 |
+
st.markdown(multi)
|
| 62 |
+
st.subheader(":blue[4.Pre-processing]")
|
| 63 |
+
multi = '''If the collected data has any impurities, this phase cleans the data and then transforms it.
|
| 64 |
+
Cleaning removes impurities of any kind from the data.
|
| 65 |
+
Raw data ---> Cleaned data'''
|
| 66 |
+
st.markdown(multi)
|
| 67 |
+
st.subheader(":blue[5.EDA]")
|
| 68 |
+
multi = '''After the pre-processing phase, the data goes through the EDA process, which unveils the hidden insights in the data.'''
|
| 69 |
+
st.markdown(multi)
|
| 70 |
+
st.subheader(":blue[6.Visualization]")
|
| 71 |
+
multi = '''After insights are found in the collected data, they are presented using visualization techniques and then brought together in a dashboard format.
|
| 72 |
+
'''
|
| 73 |
+
st.markdown(multi)
|
| 74 |
+
st.subheader(":blue[7.Story Telling]")
|
| 75 |
+
multi = '''Storytelling is the final step in data analysis, and one of the most important, because the client doesn't understand the data that is in dashboard format.
|
| 76 |
+
So we need to explain the findings to the clients so that they can understand the data. Deployment therefore plays a major role in data analysis.'''
|
| 77 |
+
st.markdown(multi)
|
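Review note: the text for steps 3 and 4 describes a decision (check for impurities, then either go straight to EDA or clean first) that a small example could make concrete for readers of the page. The following is a minimal sketch, not part of this commit. It assumes "impurities" means missing cells and exact duplicate rows, and the sample data and the helper names `simple_eda` and `preprocess` are hypothetical. It uses only the standard library so it runs without extra dependencies.

```python
import csv
import io

# Hypothetical sample data: one missing value and one duplicated row.
SAMPLE_CSV = """name,age,city
Asha,31,Pune
Ravi,,Delhi
Asha,31,Pune
"""


def simple_eda(rows):
    """Step 3: count missing cells and exact duplicate rows."""
    missing = sum(
        1 for row in rows for value in row.values() if value in ("", None)
    )
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(row.items())
        if key in seen:
            duplicates += 1
        seen.add(key)
    return {"rows": len(rows), "missing": missing, "duplicates": duplicates}


def preprocess(rows):
    """Step 4: drop duplicates and rows with a missing cell (raw -> cleaned)."""
    cleaned = []
    seen = set()
    for row in rows:
        key = tuple(row.items())
        has_missing = any(value in ("", None) for value in row.values())
        if key in seen or has_missing:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = simple_eda(rows)

# Only route the data through pre-processing when impurities were found.
if report["missing"] or report["duplicates"]:
    rows = preprocess(rows)

print(report)
print(len(rows), "row(s) ready for EDA")
```

Dropping incomplete rows is only one possible cleaning policy; imputing missing values would be an equally valid choice depending on the problem statement.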