| import streamlit as st | |
| import math | |
| st.title(":red[**1 : INTRODUCTION TO STATISTICS**]") | |
| st.markdown("""_In this field we work with data using the Python programming language. As the term DATA | |
| ANALYSIS itself suggests, it is about dealing with data: we collect the data, | |
| clean it, and then analyze it to extract insights. Now | |
| let us understand the term data._""") | |
| st.header("*What does the term data refer to?*") | |
| st.subheader(":blue[DATA]") | |
| st.markdown("""Data is a collection of information gathered through observation. It comes from a wide | |
| range of sources. Some common sources of data are given below. \n * IMAGE \n * TEXT | |
| \n * VIDEO \n * AUDIO | |
| """) | |
| st.header("DATA is classified into 3 types.") | |
| st.subheader("Structured Data", divider=True) | |
| st.subheader("Unstructured Data", divider=True) | |
| st.subheader("Semi Structured Data", divider=True) | |
| st.subheader("**Structured Data**") | |
| st.markdown("""This type of data has a well-organized | |
| format.\nIt is arranged in rows and columns. Some common examples of | |
| structured data are given below.\n * EXCEL DOCUMENT \n * STRUCTURED QUERY LANGUAGE (SQL) DATABASE | |
| """) | |
| st.image('https://cdn-uploads.huggingface.co/production/uploads/64c972774515835c4dadd754/dSbyOXaQ6N_Kg2TLxgEyt.png', width=400) | |
| st.subheader("**Unstructured Data**") | |
| st.markdown("""This type of data does not have a | |
| well-organized format; it is not arranged in rows and columns. Some common | |
| examples of unstructured data are given below.\n * IMAGE\n * VIDEO\n * TEXT\n * SOCIAL MEDIA FEEDS | |
| """) | |
| st.image("https://cdn-uploads.huggingface.co/production/uploads/64c972774515835c4dadd754/xhaNBRanDaj8esumqo9hl.png", width=400) | |
| st.subheader("**Semi Structured Data**") | |
| st.markdown("""This type of data can be described as a combination of | |
| structured and unstructured data. Some common examples of semi-structured | |
| data are given below.\n * COMMA-SEPARATED VALUES (CSV)\n * JSON FILES\n * E-MAILS\n * HTML | |
| """) | |
| st.image("https://cdn-uploads.huggingface.co/production/uploads/64c972774515835c4dadd754/Nupc6BePInRVo9gJwLfWH.png", width=400) | |
| st.title("2 : INTRODUCTION TO STATISTICS") | |
| st.markdown("""_Statistics is a branch of mathematics and a broad field that | |
| deals with data: collecting, analyzing, interpreting, and | |
| structuring it. Statistics is classified into two types._""") | |
| st.subheader("Descriptive Statistics",divider=True) | |
| st.subheader("Inferential Statistics",divider=True) | |
| st.subheader("**Descriptive Statistics**") | |
| st.markdown("""Descriptive statistics describes the main features of data. It | |
| can be performed on sample data as well as population data. Some of | |
| the key concepts of descriptive statistics are stated below.\n KEY CONCEPTS\n * Measures of Central Tendency, which involve finding the Mean, Median, and Mode.\n * Measures of Dispersion, which involve finding the Range, Variance, and Standard Deviation.\n * Distribution, which describes how frequently each value occurs; a common example is the Normal (Gaussian) distribution.""") | |
| st.subheader("Measure Of Central Tendency",divider=True) | |
| st.markdown("""A measure of central tendency gives a single central value that summarizes the data. Central tendency can be computed in | |
| three ways: \n * Mode \n * Median \n * Mean""") | |
| st.subheader("MODE",divider=True) | |
| st.markdown("""The mode gives the central tendency based on the most frequently occurring value. The major drawback of the mode is that it is frequency based: it | |
| considers only the value that occurs most often. When computing the mode we may come across situations like the following.""") | |
| st.markdown(''':violet[No_Mode] \n Let's understand when this situation arises. For example, take the list of numbers [1,2,3,4,5]. Here no | |
| number is repeated, so in this scenario we come across the No_Mode situation. | |
| ''') | |
| st.markdown(''':violet[Uni_Mode] \n Let's understand when this situation arises. For example, take the list of numbers [1,1,1,2,3,4,5]. | |
| Checking the list shows that the number 1 has the highest frequency, so the value 1 is returned as the output. | |
| ''') | |
| st.markdown(''':violet[Bi_Mode] \n Let's understand when this situation arises. For example, take the list of numbers [1,1,2,2,3,4,5]. | |
| Checking the frequencies in the list, we find two values that share the maximum frequency, hence the output will be Bi_Mode. | |
| ''') | |
| st.markdown(''':violet[Tri_Mode] \n Let's understand when this situation arises. For example, take the list of numbers [1,1,2,2,3,3,4,5]. | |
| Checking the frequencies in the list, we find three values that share the maximum frequency, hence the output will be Tri_Mode. | |
| ''') | |
| st.markdown(''':violet[Multi_Mode] \n Let's understand when this situation arises. For example, take the list of numbers [1,1,2,2,3,3,4,4,5]. | |
| Checking the frequencies in the list, we find more than three values that share the maximum frequency, hence the output will be Multi_Mode. | |
| ''') | |
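| def mode(*args): | |
|     """Classify the mode of the given numbers (no mode, one mode, bi/tri/multi mode).""" | |
|     list1 = list(args) | |
|     set1 = set(list1) | |
|     # Frequency of every distinct value | |
|     dict1 = {j: list1.count(j) for j in set1} | |
|     max_value = max(dict1.values()) | |
|     # All values that occur with the highest frequency | |
|     count = [key for key, value in dict1.items() if value == max_value] | |
|     if max_value == 1: | |
|         # No value is repeated | |
|         return 'no mode' | |
|     elif len(set1) > 1 and len(count) == len(set1): | |
|         # Every distinct value is equally frequent, so none stands out | |
|         return 'no mode' | |
|     elif len(count) == 1: | |
|         # Single mode: return it together with its frequency | |
|         return {count[0]: dict1[count[0]]} | |
|     elif len(count) == 2: | |
|         return 'bi mode' | |
|     elif len(count) == 3: | |
|         return 'tri mode' | |
|     else: | |
|         return 'multimode' | |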
| # Collect user input | |
| numbers_input = st.text_input("Enter a list of numbers separated by commas (e.g., 1, 2, 2, 3, 4):") | |
| if numbers_input: | |
|     # Convert the input string to a list of integers | |
|     try: | |
|         list1 = list(map(int, numbers_input.split(','))) | |
|         # Call the mode function with the list of integers | |
|         result = mode(*list1) | |
|         # Display the result | |
|         st.write("Mode result:", result) | |
|     except ValueError: | |
|         st.write("Please enter a valid list of numbers separated by commas.") | |
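As a cross-check on the mode cases described above, here is a minimal sketch that lists every value tied for the highest frequency using `collections.Counter`. The helper name `modes_with_counts` is hypothetical and not part of the app; treating a list in which no value repeats as having no mode is an assumption carried over from the text.

```python
from collections import Counter


def modes_with_counts(numbers):
    """Return (modes, frequency); modes is empty when no value repeats."""
    if not numbers:
        return [], 0
    counts = Counter(numbers)
    highest = max(counts.values())
    if highest == 1:
        # No value repeats: treated here as "no mode" (assumption from the text)
        return [], 1
    # Collect every value that reaches the highest frequency
    modes = sorted(value for value, freq in counts.items() if freq == highest)
    return modes, highest
```

The length of the returned list then tells the cases apart: zero values for no mode, one for a single mode, two for bi mode, and so on.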
| st.subheader("Median",divider=True) | |
| st.markdown("""The median also gives the central tendency. Its major drawback is that it focuses only on the central value. | |
| To find the median, we first sort the given list; the formula then depends on whether the length of the list is odd or even.""") | |
| st.subheader("Median Formula for Odd Number of Observations") | |
| st.latex(r''' | |
| \text{Median} = X_{\left(\frac{n+1}{2}\right)} | |
| ''') | |
| st.subheader("Median Formula for Even Number of Observations") | |
| st.latex(r''' | |
| \text{Median} = \frac{X_{\left(\frac{n}{2}\right)} + X_{\left(\frac{n}{2}+1\right)}}{2} | |
| ''') | |
| def median(list1): | |
|     """Return the median of a list of numbers.""" | |
|     # The median is defined on sorted data | |
|     sorted_list = sorted(list1) | |
|     length = len(sorted_list) | |
|     middle = length // 2 | |
|     if length % 2 == 0: | |
|         # Even count: average the two middle values | |
|         return (sorted_list[middle - 1] + sorted_list[middle]) / 2 | |
|     else: | |
|         # Odd count: take the single middle value | |
|         return sorted_list[middle] | |
| # Use a unique key for the text input widget to avoid DuplicateWidgetID error | |
| numbers_input_1 = st.text_input( | |
| "Enter a list of numbers separated by commas (e.g., 1, 2, 2, 3, 4):", | |
| key="numbers_input_1" | |
| ) | |
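| if numbers_input_1: | |
|     try: | |
|         # Split on commas and convert each piece to an integer | |
|         list1 = list(map(int, numbers_input_1.split(','))) | |
|         result = median(list1) | |
|         st.write("Median result:", result) | |
|     except ValueError: | |
|         st.write("Please enter a valid list of numbers separated by commas.") | |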
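The two median formulas shown earlier use 1-based positions, so it can help to see them mapped onto Python's 0-based indexing. The sketch below does that and compares the result with the standard library's `statistics.median`. The name `median_by_formula` and the sample lists are hypothetical, chosen only for illustration.

```python
import statistics


def median_by_formula(values):
    """Median via the 1-based position formulas, shifted to 0-based indices."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    if n % 2 == 1:
        # Odd n: X_((n+1)/2), shifted to a 0-based index
        return ordered[(n + 1) // 2 - 1]
    # Even n: average of X_(n/2) and X_(n/2 + 1), shifted to 0-based indices
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


# Illustrative samples: one odd-length list and one even-length list
sample_odd = [5, 1, 4, 2, 3]
sample_even = [4, 1, 3, 2]
assert median_by_formula(sample_odd) == statistics.median(sample_odd)
assert median_by_formula(sample_even) == statistics.median(sample_even)
```

Sorting inside the helper means the caller can pass the numbers in any order.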