| import streamlit as st | |
| st.title(":blue[Introduction to Data Analysis]") | |
| st.caption("***From data dust to diamond insights — analysis is the alchemy***") | |
| st.subheader("What is Data Analysis?",divider="green") | |
| multi = '''Data analysis is the process of inspecting, cleaning, and transforming collected data to extract meaningful insights. | |
| It systematically applies statistical, logical, and computational techniques to describe, summarize, and evaluate the data. | |
| ''' | |
| st.markdown(multi) | |
| st.subheader("Types of Data",divider="green") | |
| multi = '''To perform data analysis, we first need to know what type of data we have collected. Data is mainly categorized by whether it follows a predefined structure. | |
| On this basis, data is classified into three types. | |
| ''' | |
| st.markdown(multi) | |
| multi=''':violet[1.Structured data]''' | |
| st.markdown(multi) | |
| multi=''':violet[2.Unstructured data]''' | |
| st.markdown(multi) | |
| multi=''':violet[3.Semi-Structured data]''' | |
| st.markdown(multi) | |
| st.subheader("1.Structured Data",divider="red") | |
| multi = '''Structured data is well-formatted and organized data. | |
| It is usually stored in a tabular format, in rows and columns, as in a relational database management system (RDBMS). | |
| It is easy to search and is typically quantitative. | |
| Examples of structured data include Excel files (.xlsx) and SQL databases. | |
| ''' | |
| st.image("https://cdn-uploads.huggingface.co/production/uploads/66bde9bf3c885d04498227a0/ewYq-ld-Fr7SCE7Th0idQ.png") | |
| st.markdown(multi) | |
| st.subheader("2.Unstructured Data",divider="red") | |
| multi = '''Unstructured data has no predefined format or organization. | |
| This type of data does not fit into rows and columns; it can be a combination of text, images, audio, and so on. | |
| It is harder to analyze and is typically qualitative. | |
| Examples of unstructured data include text, images, audio, and video. | |
| ''' | |
| st.image("https://cdn-uploads.huggingface.co/production/uploads/66bde9bf3c885d04498227a0/o96nGe5pQ7EkbXTdjOkpW.png") | |
| st.markdown(multi) | |
| st.subheader("3.Semi-structured Data",divider="red") | |
| multi = '''Semi-structured data is a hybrid of structured and unstructured data. | |
| Because it combines both, it is harder to analyze than structured data. | |
| Examples of semi-structured data include CSV, JSON, and XML files. | |
| ''' | |
| st.image("https://cdn-uploads.huggingface.co/production/uploads/66bde9bf3c885d04498227a0/Gz_AZKg8M7e9K96TsVenU.png") | |
| st.markdown(multi) | |
| st.header("**Steps in Data Analysis**",divider="green") | |
| st.markdown('''A complete data analysis involves 7 steps.''') | |
| st.subheader(":blue[1.Problem Statement]") | |
| multi = '''The first step in data analysis is the problem statement: a concise description of the problem that needs to be solved. | |
| It serves as the blueprint for the analysis because it clearly identifies the specific issue to be addressed. | |
| ''' | |
| st.markdown(multi) | |
| st.subheader(":blue[2.Data Collection]") | |
| multi = '''After identifying the issue to be addressed, we need to collect data related to it. | |
| Data may need to be collected in different formats from many sources, such as websites, so that the analysis is easier to perform. | |
| We can gather new data or reuse previously collected data with the help of stakeholders and domain experts. | |
| ''' | |
| st.markdown(multi) | |
| st.subheader(":blue[3.Simple EDA (Exploratory Data Analysis)]") | |
| multi = '''After collecting the data, we need to check whether it contains any impurities. | |
| A simple EDA tells us whether the collected data has impurities. | |
| If it has none, we go directly to the EDA phase; otherwise, the data goes to the pre-processing phase. | |
| ''' | |
| st.markdown(multi) | |
| st.subheader(":blue[4.Pre-processing]") | |
| multi = '''If the collected data has impurities, this phase cleans the data and then transforms it. | |
| It removes every kind of impurity found in the data. | |
| Raw data ---> Cleaned data''' | |
| st.markdown(multi) | |
| st.subheader(":blue[5.EDA]") | |
| multi = '''After the pre-processing phase, the data goes through the EDA process, which uncovers the hidden insights in the data.''' | |
| st.markdown(multi) | |
| st.subheader(":blue[6.Visualization]") | |
| multi='''Once insights have been found in the data, they are presented using visualization techniques and then brought together in a dashboard. | |
| ''' | |
| st.markdown(multi) | |
| st.subheader(":blue[7.Storytelling]") | |
| multi = '''Storytelling is the final step in data analysis, and one of the most important, because the client may not understand the data in dashboard format. | |
| So we need to explain the findings to the clients so that they can understand the data. For this reason, deployment plays a major role in data analysis.''' | |
| st.markdown(multi) |
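
Steps 3 to 5 described in the app (Simple EDA, Pre-processing, EDA) can be illustrated with a small standalone sketch. It is not part of the app itself. The records in `raw_data` and the helper names `simple_eda` and `preprocess` are hypothetical, and it assumes that "impurities" means missing values and exact duplicate rows. Dropping such rows is only one possible cleaning strategy.

```python
from statistics import mean

# Hypothetical raw records; None marks a missing value.
raw_data = [
    {"name": "A", "age": 25, "city": "Pune"},
    {"name": "B", "age": None, "city": "Delhi"},
    {"name": "A", "age": 25, "city": "Pune"},  # exact duplicate of the first row
    {"name": "C", "age": 31, "city": "Delhi"},
]


def simple_eda(rows):
    """Step 3: count missing values and exact duplicate rows."""
    missing = sum(value is None for row in rows for value in row.values())
    unique_rows = {tuple(sorted(row.items())) for row in rows}
    return {"missing": missing, "duplicates": len(rows) - len(unique_rows)}


def preprocess(rows):
    """Step 4: drop rows with missing values and repeated rows (first copy is kept)."""
    cleaned, seen = [], set()
    for row in rows:
        key = tuple(sorted(row.items()))
        if None in row.values() or key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


report = simple_eda(raw_data)

# Clean data goes straight to EDA; otherwise it is pre-processed first.
needs_cleaning = report["missing"] > 0 or report["duplicates"] > 0
clean_data = preprocess(raw_data) if needs_cleaning else raw_data

# Step 5: a very small piece of EDA on the cleaned data.
average_age = mean(row["age"] for row in clean_data)
print(report, len(clean_data), average_age)
```

On real datasets the same checks are usually done with a dataframe library. The plain-Python version is used here only to keep the sketch free of dependencies.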