Phani1008 committed on
Commit
cf5d1c6
·
verified ·
1 Parent(s): a957b1e

Create 3_Life cycle of ML project.py

  1. pages/3_Life cycle of ML project.py +164 -0
pages/3_Life cycle of ML project.py ADDED
@@ -0,0 +1,164 @@
+ import streamlit as st
+
+ # Set page configuration
+ st.set_page_config(page_title="ML Lifecycle", layout="centered")
+
+ # Custom CSS for dark theme design
+ st.markdown("""
+ <style>
+ body {
+     background-color: #121212;
+     color: #E0E0E0;
+     font-family: Arial, sans-serif;
+ }
+ .stApp {
+     background: #1F1F1F;
+     padding: 30px;
+     border-radius: 12px;
+     box-shadow: 0 4px 8px rgba(0, 0, 0, 0.5);
+ }
+ .stButton > button {
+     display: block;
+     margin: 12px auto;
+     width: 80%;
+     background: linear-gradient(90deg, #ff6b6b, #f06595);
+     color: white;
+     border: none;
+     padding: 12px 25px;
+     font-size: 16px;
+     border-radius: 8px;
+     font-weight: bold;
+     box-shadow: 0 4px 6px rgba(0, 0, 0, 0.3);
+     cursor: pointer;
+     transition: all 0.3s ease-in-out;
+ }
+ .stButton > button:hover {
+     background: linear-gradient(90deg, #f06595, #ff6b6b);
+     transform: translateY(-3px);
+ }
+ h1 {
+     color: #ff6b6b;
+     font-size: 40px;
+     text-align: center;
+     font-weight: bold;
+     margin-bottom: 20px;
+ }
+ h2, h3 {
+     color: #f06595;
+     text-align: center;
+ }
+ .hr {
+     border: 0;
+     height: 2px;
+     background: linear-gradient(to right, #ff6b6b, #f06595);
+     margin: 20px 0;
+ }
+ .arrow {
+     font-size: 24px;
+     text-align: center;
+     color: #f06595;
+     margin: 10px 0;
+ }
+ </style>
+ """, unsafe_allow_html=True)
+
+
+
+ # Initialize session state for page navigation
+ if 'page' not in st.session_state:
+     st.session_state.page = 'main'
+ if 'previous_page' not in st.session_state:
+     st.session_state.previous_page = 'main'
+
+ # Function to render the main page
+ def main_page():
+     st.title("🚀 Machine Learning Project Lifecycle")
+
+     steps = [
+         "1. Problem Statement",
+         "2. Data Collection",
+         "3. Simple EDA",
+         "4. Data Preprocessing",
+         "5. EDA",
+         "6. Feature Engineering",
+         "7. Training the Model",
+         "8. Testing the Model",
+         "9. Deployment",
+         "10. Monitoring"
+     ]
+
+     for i, step in enumerate(steps):
+         if st.button(step):
+             st.session_state.previous_page = 'main'
+             st.session_state.page = step.replace(".", "").replace(" ", "_").lower()
+         # Add a downward arrow between steps
+         if i < len(steps) - 1:
+             st.markdown("<p class='arrow'>⬇️</p>", unsafe_allow_html=True)
+
+ # Data Collection Page
+ def data_collection_page():
+     st.title("📦 Data Collection")
+
+     st.write("In this step, we work with data using the Python programming language. Data analysis, as the name suggests, is about handling data: gathering it, cleaning it, and then analyzing it to extract valuable insights. First, let's look at what data means.")
+
+     st.header("What is Data?")
+     st.write("Put simply, data is a collection of information: facts or observations that can be recorded and measured. It comes in various forms, such as:")
+     st.markdown("- IMAGE 🖼️")
+     st.markdown("- TEXT 📝")
+     st.markdown("- VIDEO 📹")
+     st.markdown("- AUDIO 🔉")
+
+     st.write("Not all data is created equal. Data comes in various forms, and knowing how to classify it is essential for choosing the right tools and methods to analyze it. Broadly, data can be classified into three main categories:")
+     st.markdown("- Structured")
+     st.markdown("- Semi-structured")
+     st.markdown("- Unstructured")
+     st.write("Each type of data has its own characteristics, advantages, and challenges when it comes to processing and extracting insights.")
+
+     st.subheader("Structured Data:")
+     st.write("Structured data is the most organized and easily accessible form of data. It is formatted so that machines can easily store, access, and process it. Think of structured data as data that fits neatly into rows and columns, like in a spreadsheet or a relational database.")
+     st.write("Examples:")
+     st.markdown("- Databases: Tables in SQL databases where each column represents a different attribute (e.g., name, age, salary), and each row represents a record.")
+     st.markdown("- Excel Sheets: Rows and columns filled with categorical and numerical data.")
+     st.image("https://k21academy.com/wp-content/uploads/2020/10/structured-data-1.png", width=400)
+
+     st.subheader("Semi-structured Data:")
+     st.write("Semi-structured data doesn’t fit as neatly into the traditional table format as structured data, but it still follows an organizational framework. It contains tags, markers, or attributes that give it some structure without strictly conforming to rows and columns.")
+     st.write("Examples:")
+     st.markdown("- JSON and XML Files")
+     st.markdown("- NoSQL Databases")
+     st.image("https://www.imediacto.com/wp-content/uploads/2021/02/xml-csv-json-data-formats.png", width=400)
+
+     st.subheader("Unstructured Data:")
+     st.write("Unstructured data is the most complex and least organized form of data. It does not follow a specific format or structure, making it difficult to process and analyze using traditional methods. Unstructured data includes a wide range of formats, such as text, images, videos, and more.")
+     st.write("Examples:")
+     st.markdown("- Text Files: Documents, emails, and written reports.")
+     st.markdown("- Multimedia: Photos, videos, and audio files.")
+     st.markdown("- Social Media: Tweets, posts, and comments on platforms like Facebook, Twitter, or Instagram.")
+     st.image("https://k21academy.com/wp-content/uploads/2020/10/unstructured.png", width=600)
+
+
+     st.header("Data Collection Methods")
+     st.subheader("Dataset Websites")
+     st.write("""
+ - Explore platforms like Kaggle, Data.gov, and the UCI Machine Learning Repository for relevant datasets.
+     """)
+
+     st.subheader("APIs")
+     st.write("""
+ - Use APIs offered by companies or organizations to access up-to-date data, often returned as JSON or XML.
+     """)
+
+     st.subheader("Databases")
+     st.write("""
+ - Connect to relational or NoSQL databases where structured or semi-structured data is stored and retrieve the necessary information.
+     """)
+
+     st.subheader("Web Scraping")
+     st.write("""
+ - Extract information from websites using tools like BeautifulSoup or Scrapy to gather unstructured or semi-structured data.
+     """)
+
+     st.subheader("Manual Collection")
+     st.write("""
+ - Collect data manually through surveys, questionnaires, interviews, or direct observations.
+     """)
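
Review note: as shown in this diff, `main_page()` and `data_collection_page()` are defined but never called, and nothing reads `st.session_state.page` after a button sets it. The sketch below shows one possible way to close that gap; it is not part of the commit. The names `page_key` and `resolve_page` are hypothetical helpers, and the block is kept free of Streamlit imports so it can be run on its own. `page_key` mirrors the key derivation already used in `main_page()`.

```python
# Sketch only: page_key and resolve_page are hypothetical helpers,
# not functions from the commit or from Streamlit.
from typing import Callable, Dict


def page_key(step: str) -> str:
    """Derive the navigation key the same way main_page() does in the diff."""
    return step.replace(".", "").replace(" ", "_").lower()


def resolve_page(
    key: str,
    pages: Dict[str, Callable[[], None]],
    default: str = "main",
) -> Callable[[], None]:
    """Return the renderer registered for `key`.

    Falls back to the default page when no renderer exists yet, which is the
    case for most lifecycle steps in this commit.
    """
    return pages.get(key, pages[default])
```

Under these assumptions, the file could end with a mapping such as `{"main": main_page, page_key("2. Data Collection"): data_collection_page}` followed by `resolve_page(st.session_state.page, pages)()`. Because the button handler sets the key partway through a run, the selected page would probably only appear on the next rerun, so a call to `st.rerun()` after the assignment may also be needed, depending on the Streamlit version in use.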