Mpavan45 committed
Commit 8e54229 · verified · 1 Parent(s): 41cb2b4

Update app.py

Files changed (1)
  1. app.py +26 -5
app.py CHANGED
@@ -138,10 +138,31 @@ elif st.session_state.selected_page == "NLP Lifecycle":
138
 
139
  **Example**: Scraping customer reviews from Amazon to analyze sentiment and feedback about a product.
140
  """)
141
 
142
  elif lifecycle_option == "Text Preprocessing":
143
  st.write("""
144
- #### 🧹 3. Text Preprocessing
145
  Text preprocessing prepares raw text for further analysis. This stage involves cleaning and transforming the data into a structured format that machine learning models can understand.
146
  - **Tokenization**: Splitting text into smaller units (e.g., words, phrases).
147
  - **Stop Words Removal**: Removing common words that don’t contribute much information.
@@ -157,7 +178,7 @@ elif st.session_state.selected_page == "NLP Lifecycle":
157
 
158
  elif lifecycle_option == "Text Representation":
159
  st.write("""
160
- #### 4. Text Representation
161
  After preprocessing, the text data needs to be converted into a numerical format for use in machine learning models. There are several methods for text representation:
162
  - **Bag of Words (BoW)**: Converts text into a matrix of word frequencies.
163
  - **TF-IDF**: Weighs words based on their frequency in a specific document relative to their frequency across the entire dataset.
@@ -170,7 +191,7 @@ elif st.session_state.selected_page == "NLP Lifecycle":
170
 
171
  elif lifecycle_option == "Model Training":
172
  st.write("""
173
- #### πŸ‹οΈβ€β™‚οΈ 5. Model Training
174
  In the model training stage, machine learning algorithms are trained on the preprocessed and represented text data. The choice of model depends on the task:
175
  - **Text Classification**: Naive Bayes, Support Vector Machines (SVM), or neural networks.
176
  - **Named Entity Recognition (NER)**: Conditional Random Fields (CRF), LSTMs, or transformers.
@@ -181,7 +202,7 @@ elif st.session_state.selected_page == "NLP Lifecycle":
181
 
182
  elif lifecycle_option == "Evaluation":
183
  st.write("""
184
- #### πŸ… 6. Evaluation
185
  After training the model, it's important to evaluate its performance using metrics such as accuracy, precision, recall, and F1-score.
186
  - **Accuracy**: The percentage of correct predictions.
187
  - **Precision**: The percentage of relevant instances among the retrieved instances.
@@ -193,7 +214,7 @@ elif st.session_state.selected_page == "NLP Lifecycle":
193
 
194
  elif lifecycle_option == "Deployment":
195
  st.write("""
196
- #### 🚀 7. Deployment
197
  Once the model is evaluated and tuned, it is deployed into production where it can be used by end users. Deployment involves:
198
  - **Integration** with web applications, chatbots, or other tools.
199
  - **API Development**: Exposing the model through an API for real-time predictions.
 
138
 
139
  **Example**: Scraping customer reviews from Amazon to analyze sentiment and feedback about a product.
140
  """)
141
+ elif lifecycle_option == "Simple EDA":
142
+ st.write("""
143
+ #### 📊 3. Simple EDA
144
+ Simple Exploratory Data Analysis (Simple EDA) provides a quick overview of the dataset. It focuses on understanding the basic structure, spotting missing values, checking data types, and visualizing distributions.
145
+ - **Basic Data Inspection**: Viewing data types, first few rows, and general structure.
146
+ - **Summary Statistics**: Quick summary of key metrics like mean, median, and standard deviation.
147
+ - **Basic Visualizations**: Simple charts like histograms and boxplots to explore variable distributions.
148
+ - **Missing Values Check**: Identifying columns with missing values.
149
+ - **Outlier Detection**: Visual identification of outliers.
150
+
151
+ **Example**: In a sales dataset:
152
+ - Basic Data Inspection:
153
+ - Shape of the dataset: (1000, 5)
154
+ - First few rows: [Sales, Marketing Spend, Date, etc.]
155
+ - Summary Statistics:
156
+ - Mean Sales: 1000
157
+ - Median Sales: 950
158
+ - Visualizations:
159
+ - Histogram for sales distribution
160
+ - Boxplot for outlier detection
161
+ """)
162
 
163
  elif lifecycle_option == "Text Preprocessing":
164
  st.write("""
165
+ #### 🧹 4. Text Preprocessing
166
  Text preprocessing prepares raw text for further analysis. This stage involves cleaning and transforming the data into a structured format that machine learning models can understand.
167
  - **Tokenization**: Splitting text into smaller units (e.g., words, phrases).
168
  - **Stop Words Removal**: Removing common words that don’t contribute much information.
 
178
 
179
  elif lifecycle_option == "Text Representation":
180
  st.write("""
181
+ #### 5. Text Representation
182
  After preprocessing, the text data needs to be converted into a numerical format for use in machine learning models. There are several methods for text representation:
183
  - **Bag of Words (BoW)**: Converts text into a matrix of word frequencies.
184
  - **TF-IDF**: Weighs words based on their frequency in a specific document relative to their frequency across the entire dataset.
 
191
 
192
  elif lifecycle_option == "Model Training":
193
  st.write("""
194
+ #### πŸ‹οΈβ€β™‚οΈ 6. Model Training
195
  In the model training stage, machine learning algorithms are trained on the preprocessed and represented text data. The choice of model depends on the task:
196
  - **Text Classification**: Naive Bayes, Support Vector Machines (SVM), or neural networks.
197
  - **Named Entity Recognition (NER)**: Conditional Random Fields (CRF), LSTMs, or transformers.
 
202
 
203
  elif lifecycle_option == "Evaluation":
204
  st.write("""
205
+ #### πŸ… 7. Evaluation
206
  After training the model, it's important to evaluate its performance using metrics such as accuracy, precision, recall, and F1-score.
207
  - **Accuracy**: The percentage of correct predictions.
208
  - **Precision**: The percentage of relevant instances among the retrieved instances.
 
214
 
215
  elif lifecycle_option == "Deployment":
216
  st.write("""
217
+ #### 🚀 8. Deployment
218
  Once the model is evaluated and tuned, it is deployed into production where it can be used by end users. Deployment involves:
219
  - **Integration** with web applications, chatbots, or other tools.
220
  - **API Development**: Exposing the model through an API for real-time predictions.
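
Note on the added "Simple EDA" section: the steps it lists (inspection, summary statistics, missing values check, outlier detection) can be sketched with pandas as below. This is only an illustration, not code from app.py. The DataFrame, its column names (`sales`, `marketing_spend`, `date`), and its values are hypothetical. Outliers are flagged with the 1.5 * IQR rule, which is the same rule a standard boxplot uses to draw its whiskers, so no plotting library is needed here.

```python
import pandas as pd

# Hypothetical sales data; column names and values are illustrative only.
df = pd.DataFrame({
    "sales": [900.0, 950.0, 1000.0, 1050.0, None, 5000.0],
    "marketing_spend": [100.0, 120.0, 110.0, 130.0, 125.0, 115.0],
    "date": pd.to_datetime([
        "2024-01-01", "2024-01-02", "2024-01-03",
        "2024-01-04", "2024-01-05", "2024-01-06",
    ]),
})

# Basic data inspection: shape, column types, first few rows.
shape = df.shape
dtypes = df.dtypes
first_rows = df.head(3)

# Summary statistics; pandas skips missing values by default.
mean_sales = df["sales"].mean()
median_sales = df["sales"].median()
std_sales = df["sales"].std()

# Missing values check: number of missing entries per column.
missing = df.isna().sum()

# Outlier detection with the 1.5 * IQR rule (the boxplot whisker rule).
q1 = df["sales"].quantile(0.25)
q3 = df["sales"].quantile(0.75)
iqr = q3 - q1
is_outlier = (df["sales"] < q1 - 1.5 * iqr) | (df["sales"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]

print("Shape:", shape)
print("Dtypes:", dtypes, sep="\n")
print("First rows:", first_rows, sep="\n")
print("Mean / median / std of sales:", mean_sales, median_sales, std_sales)
print("Missing values per column:", missing, sep="\n")
print("Outlier rows:", outliers, sep="\n")
```

With this sample data the mean (1780) sits well above the median (1000), which is the kind of gap that hints at a skewed distribution or an outlier before any chart is drawn.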