Spaces: Mpavan45/NLP_Blog

Mpavan45 committed on Dec 21, 2024

Commit c495b41 (verified) · 1 Parent(s): fc5491b

Update app.py

Files changed (1)

app.py +5 -6

@@ -187,19 +187,18 @@ elif st.session_state.selected_page == "NLP Lifecycle":
     elif lifecycle_option == "Simple EDA":
         st.write("""
             #### 📊 3. Simple EDA
             #### Checking Data Balance
-            Before proceeding with analysis, it's important to evaluate whether the dataset is **balanced or imbalanced**. This involves examining the distribution of classes or categories in the data. By calculating the count or percentage of instances in each class, we can determine if the data is evenly distributed or if certain classes are underrepresented. Addressing imbalanced datasets is crucial to ensure reliable analysis and modeling.
             **Example**: In a classification dataset:
             - Class Distribution:
                 - Class A: 700 instances
                 - Class B: 300 instances
             - The dataset shows a 70:30 imbalance, which may require techniques like oversampling, undersampling, or synthetic data generation to correct.
-            #### Simple Exploratory Data Analysis (Simple EDA)
-            Simple EDA provides a high-level understanding of the dataset and its characteristics. It focuses on summarizing key features, identifying potential issues, and visualizing distributions to inform further analysis.
             - **Basic Data Inspection**: Examine data types, view the first few rows, and understand the overall structure.
             - **Summary Statistics**: Calculate key metrics like mean, median, and standard deviation to summarize numerical variables.
             - **Basic Visualizations**: Use histograms, boxplots, and scatterplots to explore data distributions and relationships.

     elif lifecycle_option == "Simple EDA":
         st.write("""
             #### 📊 3. Simple EDA
+           #### Simple Exploratory Data Analysis (Simple EDA)
+            Simple EDA provides a high-level understanding of the dataset and its characteristics. It focuses on summarizing key features, identifying potential issues, and visualizing distributions to inform further analysis.
             #### Checking Data Balance
+            Before proceeding with the analysis, it's essential to assess whether the dataset is balanced or imbalanced by using simple EDA (Exploratory Data Analysis). This involves examining the distribution of classes or categories in the data. By calculating the count or percentage of instances in each class, we can determine if the data is evenly distributed or if certain classes are underrepresented. Addressing class imbalance is important to ensure that the analysis and modeling processes are reliable and accurate.
             **Example**: In a classification dataset:
             - Class Distribution:
                 - Class A: 700 instances
                 - Class B: 300 instances
             - The dataset shows a 70:30 imbalance, which may require techniques like oversampling, undersampling, or synthetic data generation to correct.
+            #### Steps to Understand and Explore Your Data
             - **Basic Data Inspection**: Examine data types, view the first few rows, and understand the overall structure.
             - **Summary Statistics**: Calculate key metrics like mean, median, and standard deviation to summarize numerical variables.
             - **Basic Visualizations**: Use histograms, boxplots, and scatterplots to explore data distributions and relationships.
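
The balance check and summary statistics described in the updated text could be sketched roughly as follows. This is a minimal, standard-library illustration and not code from `app.py`: the helper names `class_distribution` and `summarize`, the `labels` list, and the `doc_lengths` values are all hypothetical, chosen to mirror the 700/300 example in the blog text.

```python
from collections import Counter
import statistics


def class_distribution(labels):
    """Return {label: (count, percentage)} for an iterable of class labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: (n, 100.0 * n / total) for label, n in counts.items()}


def summarize(values):
    """Return mean, median, and sample standard deviation of a numeric sequence."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),  # requires at least two values
    }


# Hypothetical labels matching the 70:30 example in the text.
labels = ["A"] * 700 + ["B"] * 300
dist = class_distribution(labels)
for label, (n, pct) in sorted(dist.items()):
    print(f"Class {label}: {n} instances ({pct:.1f}%)")

# Hypothetical numeric feature, e.g. token counts per document.
doc_lengths = [12, 15, 9, 30, 15]
print(summarize(doc_lengths))
```

On a real dataset the same checks would more likely be done with pandas (for example `value_counts` and `describe`); the standard library is used here only to keep the sketch dependency-free.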