Update pages/5_Pre-procesing_of_text.py
pages/5_Pre-procesing_of_text.py
CHANGED
|
@@ -75,12 +75,6 @@ st.markdown("✅ **Grammar Preservation** – If grammar is needed, avoid removi
|
|
| 75 |
|
| 76 |
st.success("π Well-structured and clean text significantly boosts ML model performance!")
|
| 77 |
|
| 78 |
-
st.markdown(
|
| 79 |
-
"""
|
| 80 |
-
<div class='caption'>Step into the world of NLP and discover the endless possibilities of language-driven innovation!</div>
|
| 81 |
-
""",
|
| 82 |
-
unsafe_allow_html=True,
|
| 83 |
-
)
|
| 84 |
|
| 85 |
st.markdown("<div class='section'>", unsafe_allow_html=True)
|
| 86 |
st.markdown("<h2 class='title'>π NLP Data Preprocessing</h2>", unsafe_allow_html=True)
|
|
@@ -91,6 +85,16 @@ st.success("π **Benefits of Preprocessing:**\n\n✅ Reduces dimensionality\n\
|
|
| 91 |
|
| 92 |
st.markdown("### ✨ **Essential Preprocessing Steps:**")
|
| 93 |
|
| 94 |
st.markdown("✅ **Converting Text Case** – Reduces dimensionality; case conversion depends on problem statement.")
|
| 95 |
st.markdown("✅ **Removing URLs, Tags, and Mentions** – Retain only if required by the problem statement.")
|
| 96 |
st.markdown("✅ **Handling Emojis** – Preserve or convert emoji data based on context.")
|
|
@@ -98,4 +102,14 @@ st.markdown("✅ **Expanding Contractions & Acronyms** – Convert abbreviations
|
|
| 98 |
st.markdown("✅ **Stop Words Removal** – Optional, useful for text simplification.")
|
| 99 |
st.markdown("✅ **Stemming & Lemmatization** – Perform only if grammar is **not** crucial for analysis.")
|
| 100 |
|
| 101 |
-
st.markdown("</div>", unsafe_allow_html=True)
|
| 75 |
|
| 76 |
st.success("π Well-structured and clean text significantly boosts ML model performance!")
|
| 77 |
|
| 78 |
|
| 79 |
st.markdown("<div class='section'>", unsafe_allow_html=True)
|
| 80 |
st.markdown("<h2 class='title'>π NLP Data Preprocessing</h2>", unsafe_allow_html=True)
|
|
|
|
| 85 |
|
| 86 |
st.markdown("### ✨ **Essential Preprocessing Steps:**")
|
| 87 |
|
| 88 |
+
st.markdown(
|
| 89 |
+
"""
|
| 90 |
+
<div class='image-container'>
|
| 91 |
+
<img src="https://cdn-uploads.huggingface.co/production/uploads/66bde9bf3c885d04498227a0/HtdtNm-UJdfN057BeKSgV.png" width="400">
|
| 92 |
+
</div>
|
| 93 |
+
""",
|
| 94 |
+
unsafe_allow_html=True,
|
| 95 |
+
)
|
| 96 |
+
|
| 97 |
+
|
| 98 |
st.markdown("✅ **Converting Text Case** – Reduces dimensionality; case conversion depends on problem statement.")
|
| 99 |
st.markdown("✅ **Removing URLs, Tags, and Mentions** – Retain only if required by the problem statement.")
|
| 100 |
st.markdown("✅ **Handling Emojis** – Preserve or convert emoji data based on context.")
|
|
|
|
| 102 |
st.markdown("✅ **Stop Words Removal** – Optional, useful for text simplification.")
|
| 103 |
st.markdown("✅ **Stemming & Lemmatization** – Perform only if grammar is **not** crucial for analysis.")
|
| 104 |
|
| 105 |
+
st.markdown("</div>", unsafe_allow_html=True)
|
| 106 |
+
|
| 107 |
+
|
| 108 |
+
|
| 109 |
+
|
| 110 |
+
st.markdown(
|
| 111 |
+
"""
|
| 112 |
+
<div class='caption'>Step into the world of NLP and discover the endless possibilities of language-driven innovation!</div>
|
| 113 |
+
""",
|
| 114 |
+
unsafe_allow_html=True,
|
| 115 |
+
)
|
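The page edited in this commit lists preprocessing steps (case conversion, removing URLs, tags, and mentions, stop-word removal) without showing them in code. The following is a minimal sketch of how a few of those steps might be chained together, using only the standard library. The function name `preprocess`, its keyword flags, and the tiny `STOP_WORDS` set are hypothetical and chosen for illustration; they are not part of the committed file, and a real project would likely use a fuller stop-word list from an NLP library.

```python
import re

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "the", "is", "are", "and", "or", "to", "of", "at"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")   # bare URLs
TAG_RE = re.compile(r"<[^>]+>")                  # HTML-style tags
MENTION_RE = re.compile(r"@\w+")                 # @mentions (also clips e-mail domains)
TOKEN_RE = re.compile(r"[A-Za-z0-9']+")          # simple word tokens


def preprocess(text, lowercase=True, remove_stop_words=True):
    """Return a list of tokens after applying the selected cleaning steps."""
    # Remove URLs, tags, and mentions; skip these if the task needs them.
    text = URL_RE.sub(" ", text)
    text = TAG_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)

    # Case conversion is optional because some tasks depend on casing.
    if lowercase:
        text = text.lower()

    tokens = TOKEN_RE.findall(text)

    # Stop-word removal is optional; it simplifies text but can lose grammar.
    if remove_stop_words:
        tokens = [t for t in tokens if t.lower() not in STOP_WORDS]
    return tokens
```

Each step sits behind a flag because, as the page itself notes, whether to lowercase text or drop stop words depends on the problem statement. Emoji handling, contraction expansion, stemming, and lemmatization are left out of this sketch since they usually rely on third-party libraries.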