Space: DOMMETI/From_Zero_to_ML_Hero


DOMMETI committed on Jan 27, 2025

Commit 7bde025 (verified) · 1 Parent(s): 2f145ad

Update pages/3_Life_Cycle_Of_Ml.py

Files changed (1)

pages/3_Life_Cycle_Of_Ml.py +70 -54

@@ -1,6 +1,4 @@
 import streamlit as st
-import pandas as pd
-import numpy as np
 # Apply custom CSS styling
 st.markdown("""
@@ -49,59 +47,77 @@ st.markdown("""
     </style>
     """, unsafe_allow_html=True)
-# Main title
-st.title("Lifecycle of a Machine Learning Project")
-# Steps of the ML lifecycle
-steps = [
-    "Problem Statement",
-    "Collect the Data",
-    "Simple EDA (Exploratory Data Analysis)",
-    "Data Processing",
-    "Original EDA",
-    "Feature Engineering",
-    "Training the Model",
-    "Testing the Model",
-    "Deployment",
-    "Monitoring",
-]
-# Sidebar navigation
-st.sidebar.title("Navigation")
-selected_step = st.sidebar.radio("Steps in ML Lifecycle", steps)
-if selected_step == "Problem Statement":
-    st.subheader("Define the Problem")
-elif selected_step == "Collect the Data":
-    st.subheader("Gather Relevant Data")
-elif selected_step == "Simple EDA (Exploratory Data Analysis)":
-    st.subheader("Initial Data Exploration")
-elif selected_step == "Data Processing":
-    st.subheader("Clean and Prepare Data")
-elif selected_step == "Original EDA":
-    st.subheader("Detailed Data Exploration")
-elif selected_step == "Feature Engineering":
-    st.subheader("Feature Engineering")
-elif selected_step == "Training the Model":
-    st.subheader("Train the Model")
-elif selected_step == "Testing the Model":
-    st.subheader("Evaluate Model Performance")
-elif selected_step == "Deployment":
-    st.subheader("Deploy the Model")
-elif selected_step == "Monitoring":
-    st.subheader("Monitor the Model")

+# Page Configuration
+st.set_page_config(page_title="Interactive NLP Guide", layout="wide")
+# Page Title
+st.markdown("<h1>Interactive NLP Guide</h1>", unsafe_allow_html=True)
+# Introduction Section
+st.markdown("<h2>Introduction to Natural Language Processing (NLP)</h2>", unsafe_allow_html=True)
+st.markdown("""
+<p>
+Natural Language Processing (NLP) is a field at the intersection of linguistics and computer science, focusing on the interaction between humans and machines via natural language. NLP powers applications such as:
+</p>
+<ul class="icon-bullet">
+    <li>Chatbots and Virtual Assistants</li>
+    <li>Machine Translation (e.g., Google Translate)</li>
+    <li>Text Summarization</li>
+    <li>Sentiment Analysis</li>
+    <li>Speech Recognition Systems</li>
+</ul>
+""", unsafe_allow_html=True)
+# Tokenization Section
+st.markdown("<h2>Tokenization</h2>", unsafe_allow_html=True)
+st.markdown("<h3>What is Tokenization?</h3>", unsafe_allow_html=True)
+st.markdown("""
+<p>
+Tokenization is the process of breaking text into smaller units called tokens, such as sentences or words. It is the first step in most NLP pipelines.
+</p>
+""", unsafe_allow_html=True)
+st.markdown("""
+<h3>Types of Tokenization:</h3>
+<ul class="icon-bullet">
+    <li><strong>Word Tokenization:</strong> Splitting text into words and punctuation marks (e.g., "I love NLP." → ["I", "love", "NLP", "."])</li>
+    <li><strong>Sentence Tokenization:</strong> Splitting text into sentences (e.g., "NLP is fascinating. It's the future." → ["NLP is fascinating.", "It's the future."])</li>
+</ul>
+""", unsafe_allow_html=True)
+# Example Code
+st.markdown("<h3>Code Example:</h3>", unsafe_allow_html=True)
+st.code("""
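+import nltk
+from nltk.tokenize import word_tokenize, sent_tokenize
+# One-time download of the Punkt tokenizer models (newer NLTK releases may need "punkt_tab" instead)
+nltk.download("punkt")
+text = "Natural Language Processing is exciting. Let's explore it!"
+# Split the text into word/punctuation tokens and into sentences
+word_tokens = word_tokenize(text)
+sentence_tokens = sent_tokenize(text)
+print("Word Tokens:", word_tokens)
+print("Sentence Tokens:", sentence_tokens)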
+""", language="python")
+# Adding more sections
+st.markdown("<h2>Other NLP Techniques</h2>", unsafe_allow_html=True)
+st.markdown("""
+<p>
+Beyond tokenization, several other techniques are worth knowing:
+</p>
+<ul class="icon-bullet">
+    <li><strong>One-Hot Vectorization:</strong> Represents each unique word as a binary vector that is as long as the vocabulary, with a single 1 at the word's index.</li>
+    <li><strong>Bag of Words:</strong> Represents a text by how often each vocabulary word occurs in it, disregarding word order.</li>
+    <li><strong>TF-IDF:</strong> Weights a word by its frequency in a document and by its rarity across the whole collection, so words that appear everywhere count for less.</li>
+    <li><strong>Word Embeddings:</strong> Dense vector representations of words that capture their semantic meanings.</li>
+</ul>
+""", unsafe_allow_html=True)
+st.markdown("<h3>Key Takeaways:</h3>", unsafe_allow_html=True)
+st.markdown("""
+<ul class="icon-bullet">
+    <li>Tokenization is the foundation of most NLP tasks.</li>
+    <li>NLP techniques can transform unstructured text into structured formats for analysis.</li>
+    <li>Tools like NLTK, spaCy, and Hugging Face make NLP accessible to developers and researchers.</li>
+</ul>
+""", unsafe_allow_html=True)
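
Note on the "Other NLP Techniques" list added in this commit: the first three techniques can be illustrated with a small dependency-free sketch. It is not part of the commit. The toy corpus, the helper names (`tokenize`, `one_hot`, `bag_of_words`, `tf_idf`), and the specific TF-IDF weighting (raw term count times `log(N / df)`) are assumptions chosen for illustration; libraries such as scikit-learn use smoothed variants that give different numbers.

```python
import math
from collections import Counter

# Hypothetical toy corpus, already lowercased and free of punctuation.
corpus = [
    "i love nlp",
    "nlp is fascinating",
    "i love fascinating problems",
]


def tokenize(text):
    """Whitespace tokenizer; a stand-in for a real tokenizer such as NLTK's."""
    return text.split()


docs = [tokenize(doc) for doc in corpus]

# The vocabulary is the sorted set of unique tokens across all documents.
vocab = sorted({tok for doc in docs for tok in doc})
index = {tok: i for i, tok in enumerate(vocab)}


def one_hot(token):
    """Binary vector with a single 1 at the token's vocabulary index."""
    vec = [0] * len(vocab)
    vec[index[token]] = 1
    return vec


def bag_of_words(tokens):
    """Count of each vocabulary word in the document; word order is ignored."""
    counts = Counter(tokens)
    return [counts[tok] for tok in vocab]


def tf_idf(tokens):
    """Term count times log(N / df), one common unsmoothed formulation."""
    n_docs = len(docs)
    counts = Counter(tokens)
    weights = []
    for tok in vocab:
        df = sum(1 for doc in docs if tok in doc)  # documents containing tok
        weights.append(counts[tok] * math.log(n_docs / df))
    return weights


if __name__ == "__main__":
    print("Vocabulary:", vocab)
    print("One-hot for 'nlp':", one_hot("nlp"))
    print("Bag of words for doc 0:", bag_of_words(docs[0]))
    print("TF-IDF for doc 1:", [round(w, 3) for w in tf_idf(docs[1])])
```

In this sketch a word that occurs in only one document (such as "is") receives a higher TF-IDF weight than a word shared by two documents (such as "nlp"), which is the "rarity across documents" effect the page describes. Word embeddings are left out because they require a trained model.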