Update pages/7_Advance_vectorization_techniques.py
pages/7_Advance_vectorization_techniques.py
@@ -144,4 +144,67 @@ st.markdown('''
 ''')
 
 if file_type == "Word2Vec":
-    st.title(":red[Word2Vec]")
+    st.title(":red[Word2Vec]")
+    st.markdown(
+        """
+        <div class='box'>
+        <h3 style='color: #6A0572;'>How Does Word2Vec Work?</h3>
+        <ul>
+        <li>After <strong>training</strong>, we obtain the final <span class='highlight'>Word2Vec model</span>.</li>
+        <li>The model stores a <strong>dictionary</strong> of word-vector pairs:</li>
+        </ul>
+        <pre style="background-color:#F7F7F7; padding: 10px; border-radius: 5px;">
+        { w1: [v1], w2: [v2], w3: [v3] }
+        </pre>
+        </div>
+        """,
+        unsafe_allow_html=True,
+    )
+    st.markdown(
+        """
+        <div class='box'>
+        <h3 style='color: #6A0572;'>Training vs. Test Time</h3>
+        <ul>
+        <li><strong>Training Time</strong>: <span class='highlight'>Corpus + Shallow Neural Network (CBOW or Skip-gram)</span> → Generates Model</li>
+        <li><strong>Test Time</strong>: <span class='highlight'>Word</span> → Looked up in Dictionary → Returns <span class='highlight'>Vector Representation</span></li>
+        </ul>
+        </div>
+        """,
+        unsafe_allow_html=True,
+    )
+
+    st.markdown(
+        """
+        <div class='box'>
+        <h3 style='color: #6A0572;'>How Does It Preserve Meaning?</h3>
+        <ul>
+        <li>It learns from the <strong>context</strong> in which each word appears in the <span class='highlight'>corpus</span>.</li>
+        <li>Given a word, it looks the word up in the dictionary and retrieves its <strong>semantic vector</strong>.</li>
+        <li>Unlike bag-of-words or TF-IDF vectors, the <span class='highlight'>dimensions are not vocabulary words</span> but learned features that capture meaning.</li>
+        </ul>
+        </div>
+        """,
+        unsafe_allow_html=True,
+    )
+
+    st.markdown(
+        """
+        <div class='box'>
+        <h3 style='color: #6A0572;'>Why Is the Corpus Important?</h3>
+        <ul>
+        <li>The vectors that the <strong>Word2Vec algorithm</strong> learns depend entirely on the corpus.</li>
+        <li>Better corpus → better word representations.</li>
+        <li>It <strong>preserves semantic meaning</strong> by learning from neighboring words (the context).</li>
+        </ul>
+        </div>
+        """,
+        unsafe_allow_html=True,
+    )
+    st.markdown(
+        """
+        <div class='formula'>
+        <strong>Word2Vec understands words by their meaning, not just their presence!</strong>
+        </div>
+        """,
+        unsafe_allow_html=True,
+    )
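The two phases described in the added text can be sketched in plain Python. This is a minimal illustration, not code from the commit: `TOY_VECTORS`, `skipgram_pairs`, `lookup`, and `cosine_similarity` are hypothetical names, and the vector values are made up. A real Word2Vec model would learn its vectors by training a shallow network on pairs like the ones `skipgram_pairs` produces; only the pair extraction and the test-time dictionary lookup are shown here.

```python
import math

# Hypothetical toy "model": a plain dict from word to vector.
# The numbers are invented for illustration; a trained model would learn them.
TOY_VECTORS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs, the neighborhood signal skip-gram trains on."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def lookup(word, vectors=TOY_VECTORS):
    """Test time: a dictionary lookup, with no further learning involved."""
    return vectors.get(word)  # None for out-of-vocabulary words


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; higher means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

With these toy values, `cosine_similarity(lookup("king"), lookup("queen"))` comes out higher than the similarity between "king" and "apple", which is the sense in which the dimensions encode meaning rather than word identity.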