Harika22 committed on
Commit d32aaba · verified · 1 Parent(s): de5a7ff

Update pages/5_Pre-procesing_of_text.py

  1. pages/5_Pre-procesing_of_text.py +7 -7
pages/5_Pre-procesing_of_text.py CHANGED
@@ -96,18 +96,18 @@ st.markdown(
  <div class='section'>
  Converts raw data into pre-processed data
 
- - which has 2 benefits:
-
- - Reduce the dimensionality ---> to increase the performance of ML
-
- - Raw data - preprocessed data ---> required by the problem statement
+ which has 2 benefits:
+
+ Reduce the dimensionality ---> to increase the performance of ML
+
+ Raw data - preprocessed data ---> required by the problem statement
  <ul>
- <li><b>Converting into particular case</b>So that highly we can reduce the dimensionalty.If the problem statement says that grammar should be preserved then no need of conversion</li>
+ <li><b>Converting into particular case</b>So that highly we can reduce the dimensionalty,if the problem statement says that grammar should be preserved then no need of conversion</li>
  <li><b>Removing URL's / tags/mails/mentions</b>Converting or preserving information should be based on the problem statement</li>
  <li><b>Handling Emoji's</b>Emoji's data should be preserved</li>
  <li><b>Contractions and acronyms</b>Both the contractions and acronyms should be converted into general text</li>
  <li><b>Stop Words</b>Stop words make the grammar very clear</li>
- <li><b>Stemming and Lemmatization</b>Both are purely based on problm statement and if problem statement wants grammatical concept don't perform stemming</li>
+ <li><b>Stemming and Lemmatization</b>Both are purely based on problem statement and if problem statement wants grammatical concept don't perform stemming</li>
  </ul>
  </div>
  ''',
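
The preprocessing steps listed in the changed page text can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not code from the repository. The lookup tables, the `preprocess` function and its flags, and the suffix-stripping `naive_stem` helper are all hypothetical names introduced here. A real pipeline would more likely use a proper stemmer or lemmatizer and fuller word lists. Emoji are kept as information by replacing them with their Unicode names, which is one possible reading of "Emoji's data should be preserved".

```python
import re
import unicodedata

# Hypothetical, deliberately tiny lookup tables (assumptions for illustration).
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "it's": "it is"}
ACRONYMS = {"asap": "as soon as possible", "ml": "machine learning"}
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "it", "this", "i"}

TAG_RE = re.compile(r"<[^>]+>")                        # HTML-style tags
URL_RE = re.compile(r"https?://\S+|www\.\S+")          # URLs
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")  # e-mail addresses
MENTION_RE = re.compile(r"@\w+")                       # @mentions


def describe_emoji(text):
    """Replace symbol characters (Unicode category 'So') with their names."""
    out = []
    for ch in text:
        if unicodedata.category(ch) == "So":
            name = unicodedata.name(ch, "symbol").lower().replace(" ", "_")
            out.append(" " + name + " ")
        else:
            out.append(ch)
    return "".join(out)


def naive_stem(word):
    """Crude suffix stripping; a stand-in for a real stemmer, not a replacement."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text, lowercase=True, drop_stop_words=True, stem=True):
    """Return a list of tokens; each flag maps to a problem-statement choice."""
    # Remove tags, URLs, e-mails, then mentions (e-mails first, so the
    # mention pattern does not eat the domain part of an address).
    for pattern in (TAG_RE, URL_RE, EMAIL_RE, MENTION_RE):
        text = pattern.sub(" ", text)
    text = describe_emoji(text)
    if lowercase:
        text = text.lower()
    tokens = []
    for raw in text.split():
        token = raw.strip(".,!?;:\"()")
        token = CONTRACTIONS.get(token.lower(), token)
        token = ACRONYMS.get(token.lower(), token)
        tokens.extend(token.split())  # expansions may yield several words
    if drop_stop_words:
        tokens = [t for t in tokens if t.lower() not in STOP_WORDS]
    if stem:
        tokens = [naive_stem(t) for t in tokens]
    return tokens


print(preprocess("Check <b>this</b> https://example.com @bob I don't like waiting 😀"))
# ['check', 'do', 'not', 'like', 'wait', 'grinning_face']
```

When the problem statement requires grammar to be preserved, the same function can be called with `lowercase=False, drop_stop_words=False, stem=False`, which leaves case, stop words, and word endings untouched while still removing URLs, tags, e-mails, and mentions.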