Commit 3271caa (verified), committed by LakshmiHarika
1 parent: dc4fcbc

Update pages/Data Collection.py

Files changed (1): pages/Data Collection.py (+52 -3)
@@ -136,7 +136,6 @@ if data_type == "Structured Data":
  if st.button("Explore Excel"):
  st.write("Redirecting to Excel page...")

- # Unstructured Data Section
  elif data_type == "Unstructured Data":
  st.markdown("""
  <div style="text-align: left; margin-top: 20px;">
@@ -151,6 +150,8 @@ elif data_type == "Unstructured Data":
  """, unsafe_allow_html=True)
  st.write("""
  **Unstructured data** refers to information that does not follow a predefined format or structure.
  """)

  st.markdown("""
@@ -160,7 +161,21 @@ elif data_type == "Unstructured Data":
  """, unsafe_allow_html=True)
  st.write("""
  - Does not follow a specific schema or structure.
- - Requires advanced tools for analysis.
  """)

  st.markdown("""
@@ -169,6 +184,14 @@ elif data_type == "Unstructured Data":
  </div>
  """, unsafe_allow_html=True)

  if st.button("Unstructured Data Formats"):
  st.write("Select a format to explore:")
@@ -191,6 +214,7 @@ elif data_type == "Unstructured Data":
  if st.button("Explore Text"):
  st.write("Redirecting to Text page...")

  # Semi-Structured Data Section
  elif data_type == "Semi-Structured Data":
  st.markdown("""
@@ -205,7 +229,32 @@ elif data_type == "Semi-Structured Data":
  </div>
  """, unsafe_allow_html=True)
  st.write("""
- **Semi-Structured data** refers to information that contains markers or tags for structure but is not stored in a strict tabular format.
  """)

  st.markdown("""
 
  if st.button("Explore Excel"):
  st.write("Redirecting to Excel page...")

  elif data_type == "Unstructured Data":
  st.markdown("""
  <div style="text-align: left; margin-top: 20px;">
 
  """, unsafe_allow_html=True)
  st.write("""
  **Unstructured data** refers to information that does not follow a predefined format or structure.
+ It is typically raw data that lacks a clear, organized schema, making it harder to store and analyze using traditional tools.
+ Examples include multimedia files (images, videos, audio), emails, and social media posts.
  """)

  st.markdown("""
 
  """, unsafe_allow_html=True)
  st.write("""
  - Does not follow a specific schema or structure.
+ - Cannot be stored in traditional tabular formats like rows and columns.
+ - Requires advanced tools like machine learning or natural language processing (NLP) for analysis.
+ """)
+
+ st.markdown("""
+ <div style="text-align: left; margin-top: 20px;">
+ <h4 style="color: #5b2c6f;">Example:</h4>
+ </div>
+ """, unsafe_allow_html=True)
+ st.write("""
+ Examples of unstructured data include:
+ - **Images**: Photos, screenshots, or scanned documents.
+ - **Audio**: Podcasts, voice recordings, or music files.
+ - **Videos**: Recorded lectures, surveillance footage, or YouTube videos.
+ - **Text**: Emails, social media posts, and blog articles.
  """)

  st.markdown("""
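The added bullet about unstructured data needing extra tooling can be illustrated with a small sketch. It assumes a hypothetical free-text email body (not taken from the page) and uses plain regular expressions as a stand-in for heavier NLP tooling; the point is only that the fields have to be inferred from the text because no schema names them.

```python
import re

# Hypothetical free-text sample; the addresses and date are made up for illustration.
EMAIL_BODY = """Hi team, the Q3 review moved to 2024-10-14.
Please send slides to priya@example.com or sam@example.org by Friday."""

# Simple illustrative patterns, not a complete email or date grammar.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_fields(text: str) -> dict[str, list[str]]:
    """Pull a few structured fields out of unstructured text."""
    return {
        "emails": EMAIL_PATTERN.findall(text),
        "dates": DATE_PATTERN.findall(text),
    }


fields = extract_fields(EMAIL_BODY)
print(fields)
```

With structured data the same values would simply be read from named columns; here they must be located by pattern, which is why free text is harder to analyze with traditional tools.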
 
  </div>
  """, unsafe_allow_html=True)

+ st.write("""
+ **Data Formats:**
+ - **Images**: Formats like JPEG, PNG, BMP, and TIFF.
+ - **Audio**: Formats like MP3, WAV, and FLAC.
+ - **Videos**: Formats like MP4, AVI, and MKV.
+ - **Text**: Formats like TXT, LOG, and DOCX.
+ """)
+
  if st.button("Unstructured Data Formats"):
  st.write("Select a format to explore:")
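The "Data Formats" list added in this hunk maps naturally onto a lookup by file extension. The following is a minimal sketch of that idea, not code from the commit: `FORMAT_CATEGORIES` and `categorize` are hypothetical names, and the `.jpg` and `.tif` spellings are an assumption added alongside the formats the page lists.

```python
from pathlib import Path

# Extensions follow the "Data Formats" list in the diff.
# ".jpg" and ".tif" are common alternative spellings, added here as an assumption.
FORMAT_CATEGORIES = {
    "Images": {".jpeg", ".jpg", ".png", ".bmp", ".tiff", ".tif"},
    "Audio": {".mp3", ".wav", ".flac"},
    "Videos": {".mp4", ".avi", ".mkv"},
    "Text": {".txt", ".log", ".docx"},
}


def categorize(filename: str) -> str:
    """Return the unstructured-data category for a file name, or "Unknown"."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match on the extension
    for category, extensions in FORMAT_CATEGORIES.items():
        if suffix in extensions:
            return category
    return "Unknown"


for name in ["holiday.PNG", "lecture.mp4", "notes.docx", "table.csv"]:
    print(name, "->", categorize(name))
```

A lookup like this could drive which "Explore ..." button or page is offered for an uploaded file, though how the app routes between pages is not shown in the diff.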
 
  if st.button("Explore Text"):
  st.write("Redirecting to Text page...")

+
  # Semi-Structured Data Section
  elif data_type == "Semi-Structured Data":
  st.markdown("""
 
  </div>
  """, unsafe_allow_html=True)
  st.write("""
+ **Semi-Structured data** refers to information that does not follow a strict tabular format but contains tags or markers to separate data elements.
+ This type of data is more flexible than structured data but still organized enough to allow for easier analysis than unstructured data.
+ """)
+
+ st.markdown("""
+ <div style="text-align: left; margin-top: 20px;">
+ <h4 style="color: #5b2c6f;">Characteristics:</h4>
+ </div>
+ """, unsafe_allow_html=True)
+ st.write("""
+ - Contains markers or tags (e.g., XML, JSON keys) to provide structure.
+ - More flexible than structured data, allowing for varying schemas.
+ - Easier to process than unstructured data.
+ """)
+
+ st.markdown("""
+ <div style="text-align: left; margin-top: 20px;">
+ <h4 style="color: #5b2c6f;">Examples:</h4>
+ </div>
+ """, unsafe_allow_html=True)
+ st.write("""
+ Examples of semi-structured data include:
+ - **CSV**: Comma-separated values in plain-text files.
+ - **JSON**: A lightweight data-interchange format used in web applications.
+ - **XML**: Extensible Markup Language for structured document encoding.
+ - **HTML**: Markup language for web pages.
  """)

  st.markdown("""
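The semi-structured characteristics added in this hunk (keys or tags that carry the structure, and records whose schemas may vary) can be sketched with the Python standard library. The sample JSON and XML strings below are hypothetical and only illustrate the idea; they are not data from the app.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical records: the second JSON object has a different set of keys.
JSON_TEXT = '[{"name": "Asha", "city": "Pune"}, {"name": "Ravi", "phone": "555-0100"}]'
XML_TEXT = "<people><person><name>Asha</name><city>Pune</city></person></people>"

# JSON: the keys act as markers, so a varying schema can be flattened into rows.
records = json.loads(JSON_TEXT)
all_keys = sorted({key for record in records for key in record})
rows = [{key: record.get(key) for key in all_keys} for record in records]

# XML: the tags play the same role as JSON keys.
root = ET.fromstring(XML_TEXT)
xml_rows = [
    {child.tag: child.text for child in person}
    for person in root.findall("person")
]

print(all_keys)
print(rows)
print(xml_rows)
```

Missing fields become `None` after flattening, which is the practical meaning of "more flexible than structured data": each record names its own fields instead of conforming to a fixed set of columns.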