Spaces: Mpavan45/NLP_Blog

Mpavan45 committed on Dec 21, 2024

Commit 8e54229 (verified) · 1 Parent(s): 41cb2b4

Update app.py

Files changed (1)

app.py +26 -5

@@ -138,10 +138,31 @@ elif st.session_state.selected_page == "NLP Lifecycle":
         **Example**: Scraping customer reviews from Amazon to analyze sentiment and feedback about a product.
         """)
     elif lifecycle_option == "Text Preprocessing":
         st.write("""
-        #### 🧹 3. Text Preprocessing
         Text preprocessing prepares raw text for further analysis. This stage involves cleaning and transforming the data into a structured format that machine learning models can understand.
         - **Tokenization**: Splitting text into smaller units (e.g., words, phrases).
         - **Stop Words Removal**: Removing common words that don’t contribute much information.
@@ -157,7 +178,7 @@ elif st.session_state.selected_page == "NLP Lifecycle":
     elif lifecycle_option == "Text Representation":
         st.write("""
-        #### 📝 4. Text Representation
         After preprocessing, the text data needs to be converted into a numerical format for use in machine learning models. There are several methods for text representation:
         - **Bag of Words (BoW)**: Converts text into a matrix of word frequencies.
         - **TF-IDF**: Weighs words based on their frequency in a specific document relative to their frequency across the entire dataset.
@@ -170,7 +191,7 @@ elif st.session_state.selected_page == "NLP Lifecycle":
     elif lifecycle_option == "Model Training":
         st.write("""
-        #### 🏋️‍♂️ 5. Model Training
         In the model training stage, machine learning algorithms are trained on the preprocessed and represented text data. The choice of model depends on the task:
         - **Text Classification**: Naive Bayes, Support Vector Machines (SVM), or neural networks.
         - **Named Entity Recognition (NER)**: Conditional Random Fields (CRF), LSTMs, or transformers.
@@ -181,7 +202,7 @@ elif st.session_state.selected_page == "NLP Lifecycle":
     elif lifecycle_option == "Evaluation":
         st.write("""
-        #### 🏅 6. Evaluation
         After training the model, it's important to evaluate its performance using metrics such as accuracy, precision, recall, and F1-score.
         - **Accuracy**: The percentage of correct predictions.
         - **Precision**: The percentage of relevant instances among the retrieved instances.
@@ -193,7 +214,7 @@ elif st.session_state.selected_page == "NLP Lifecycle":
     elif lifecycle_option == "Deployment":
         st.write("""
-        #### 🚀 7. Deployment
         Once the model is evaluated and tuned, it is deployed into production where it can be used by end users. Deployment involves:
         - **Integration** with web applications, chatbots, or other tools.
         - **API Development**: Exposing the model through an API for real-time predictions.

         **Example**: Scraping customer reviews from Amazon to analyze sentiment and feedback about a product.
         """)
+    elif lifecycle_option == "Simple EDA":
+        st.write("""
+        #### 📊 3. Simple EDA
+        Simple Exploratory Data Analysis (Simple EDA) provides a quick overview of the dataset. It focuses on understanding the basic structure, spotting missing values, checking data types, and visualizing distributions.
+        - **Basic Data Inspection**: Viewing data types, first few rows, and general structure.
+        - **Summary Statistics**: Quick summary of key metrics like mean, median, and standard deviation.
+        - **Basic Visualizations**: Simple charts like histograms and boxplots to explore variable distributions.
+        - **Missing Values Check**: Identifying columns with missing values.
+        - **Outlier Detection**: Visual identification of outliers.
+        **Example**: In a sales dataset:
+        - Basic Data Inspection:
+            - Shape of the dataset: (1000, 5)
+            - First few rows: [Sales, Marketing Spend, Date, etc.]
+        - Summary Statistics:
+            - Mean Sales: 1000
+            - Median Sales: 950
+        - Visualizations:
+            - Histogram for sales distribution
+            - Boxplot for outlier detection
+        """)
     elif lifecycle_option == "Text Preprocessing":
         st.write("""
+        #### 🧹 4. Text Preprocessing
         Text preprocessing prepares raw text for further analysis. This stage involves cleaning and transforming the data into a structured format that machine learning models can understand.
         - **Tokenization**: Splitting text into smaller units (e.g., words, phrases).
         - **Stop Words Removal**: Removing common words that don’t contribute much information.
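
Note on the preprocessing text: tokenization and stop-word removal can be sketched in a few lines. This is only an illustration and not code from app.py; the regular expression, the tiny `STOP_WORDS` set, and both helper names are assumptions. Real projects usually take the stop-word list from an NLP library.

```python
import re

# Tiny illustrative stop-word set (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "and", "is", "it", "of", "the", "this", "to"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop tokens that appear in the stop-word set."""
    return [t for t in tokens if t not in stop_words]


tokens = tokenize("This is the best phone of the year!")
filtered = remove_stop_words(tokens)
print(tokens)
print(filtered)
```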
     elif lifecycle_option == "Text Representation":
         st.write("""
+        #### 📝 5. Text Representation
         After preprocessing, the text data needs to be converted into a numerical format for use in machine learning models. There are several methods for text representation:
         - **Bag of Words (BoW)**: Converts text into a matrix of word frequencies.
         - **TF-IDF**: Weighs words based on their frequency in a specific document relative to their frequency across the entire dataset.
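
Note on the representation text: the difference between Bag of Words and TF-IDF is easier to see on a toy corpus. The sketch below is an assumption-laden illustration, not code from app.py: the three documents and both helper functions are made up, and the IDF is the plain unsmoothed `log(N / df)` form, whereas libraries often apply a smoothed variant.

```python
import math
from collections import Counter

# Hypothetical, already-preprocessed documents.
docs = [
    "great phone great battery",
    "poor battery",
    "great screen",
]
tokenized = [d.split() for d in docs]
# Fixed, sorted vocabulary so every vector has the same column order.
vocab = sorted({t for doc in tokenized for t in doc})


def bag_of_words(tokens):
    """Raw count of each vocabulary word in one document."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]


def tf_idf(tokens):
    """Term frequency times unsmoothed inverse document frequency."""
    counts = Counter(tokens)
    n_docs = len(tokenized)
    weights = []
    for w in vocab:
        tf = counts[w] / len(tokens)
        df = sum(1 for doc in tokenized if w in doc)
        weights.append(tf * math.log(n_docs / df))
    return weights


print(vocab)
for doc in tokenized:
    print(bag_of_words(doc), [round(x, 3) for x in tf_idf(doc)])
```

In the first document "great" has the highest raw count, but "phone" appears in only one document, so its IDF is larger than that of "great".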
     elif lifecycle_option == "Model Training":
         st.write("""
+        #### 🏋️‍♂️ 6. Model Training
         In the model training stage, machine learning algorithms are trained on the preprocessed and represented text data. The choice of model depends on the task:
         - **Text Classification**: Naive Bayes, Support Vector Machines (SVM), or neural networks.
         - **Named Entity Recognition (NER)**: Conditional Random Fields (CRF), LSTMs, or transformers.
     elif lifecycle_option == "Evaluation":
         st.write("""
+        #### 🏅 7. Evaluation
         After training the model, it's important to evaluate its performance using metrics such as accuracy, precision, recall, and F1-score.
         - **Accuracy**: The percentage of correct predictions.
         - **Precision**: The percentage of relevant instances among the retrieved instances.
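
Note on the evaluation text: the four metrics it names can be computed directly from true and predicted labels. This is a minimal sketch for binary classification, not code from app.py; the label lists and the `classification_metrics` helper are hypothetical, and label `1` is assumed to be the positive class.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 1 = positive review, 0 = negative review.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```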
     elif lifecycle_option == "Deployment":
         st.write("""
+        #### 🚀 8. Deployment
         Once the model is evaluated and tuned, it is deployed into production where it can be used by end users. Deployment involves:
         - **Integration** with web applications, chatbots, or other tools.
         - **API Development**: Exposing the model through an API for real-time predictions.