satya11 committed
Commit ebc016e · verified · 1 Parent(s): 00a0860

Create 7_Advance_vectorization_techniques.py

pages/7_Advance_vectorization_techniques.py ADDED
@@ -0,0 +1,559 @@
1
+ import streamlit as st
2
+
3
+ st.markdown("""
4
+ <style>
5
+ /* Set a soft background color */
6
+ body {
7
+ background-color: #eef2f7;
8
+ }
9
+ /* Style for main title */
10
+ h1 {
11
+ color: black;
12
+ font-family: 'Roboto', sans-serif;
13
+ font-weight: 700;
14
+ text-align: center;
15
+ margin-bottom: 25px;
16
+ }
17
+ /* Style for headers */
18
+ h2 {
19
+ color: black;
20
+ font-family: 'Roboto', sans-serif;
21
+ font-weight: 600;
22
+ margin-top: 30px;
23
+ }
24
+
25
+ /* Style for subheaders */
26
+ h3 {
27
+ color: red;
28
+ font-family: 'Roboto', sans-serif;
29
+ font-weight: 500;
30
+ margin-top: 20px;
31
+ }
32
+ .custom-subheader {
33
+ color: black;
34
+ font-family: 'Roboto', sans-serif;
35
+ font-weight: 600;
36
+ margin-bottom: 15px;
37
+ }
38
+ /* Paragraph styling */
39
+ p {
40
+ font-family: 'Georgia', serif;
41
+ line-height: 1.8;
42
+ color: white;
43
+ margin-bottom: 20px;
44
+ }
45
+ /* List styling with checkmark bullets */
46
+ .icon-bullet {
47
+ list-style-type: none;
48
+ padding-left: 20px;
49
+ }
50
+ .icon-bullet li {
51
+ font-family: 'Georgia', serif;
52
+ font-size: 1.1em;
53
+ margin-bottom: 10px;
54
+ color: black;
55
+ }
56
+ .icon-bullet li::before {
57
+ content: "◆";
58
+ padding-right: 10px;
59
+ color: black;
60
+ }
61
+ /* Sidebar styling */
62
+ .sidebar .sidebar-content {
63
+ background-color: #ffffff;
64
+ border-radius: 10px;
65
+ padding: 15px;
66
+ }
67
+ .sidebar h2 {
68
+ color: #495057;
69
+ }
70
+ .step-box {
71
+ font-size: 18px;
72
+ background-color: #F0F8FF;
73
+ padding: 15px;
74
+ border-radius: 10px;
75
+ box-shadow: 2px 2px 8px #D3D3D3;
76
+ line-height: 1.6;
77
+ }
78
+ .box {
79
+ font-size: 18px;
80
+ background-color: #F0F8FF;
81
+ padding: 15px;
82
+ border-radius: 10px;
83
+ box-shadow: 2px 2px 8px #D3D3D3;
84
+ line-height: 1.6;
85
+ }
86
+ .title {
87
+ font-size: 26px;
88
+ font-weight: bold;
89
+ color: #E63946;
90
+ text-align: center;
91
+ margin-bottom: 15px;
92
+ }
93
+ .formula {
94
+ font-size: 20px;
95
+ font-weight: bold;
96
+ color: #2A9D8F;
97
+ background-color: #F7F7F7;
98
+ padding: 10px;
99
+ border-radius: 5px;
100
+ text-align: center;
101
+ margin-top: 10px;
102
+ }
103
+ /* Custom button style */
104
+ .streamlit-button {
105
+ background-color: #00FFFF;
106
+ color: #000000;
107
+ font-weight: bold;
108
+ }
109
+ </style>
110
+ """, unsafe_allow_html=True)
111
+
112
+ st.header("Vectorization🧭")
113
+ st.markdown(
114
+ """
115
+ <div class='info-box'>
116
+ <p>Vectorization is the process of converting text into numerical vectors.</p>
117
+ <p>This allows ML models to process text data effectively.</p>
118
+ </div>
119
+ """,
120
+ unsafe_allow_html=True
121
+ )
122
+
123
+ st.markdown("""
124
+ The advanced vectorization techniques covered here are:
125
+ <ul class="icon-bullet">
126
+ <li>Word Embedding </li>
127
+ <li>Word2Vec </li>
128
+ <li>FastText</li>
129
+ </ul>
130
+ """, unsafe_allow_html=True)
131
+
132
+ st.sidebar.title("Navigation 🧭")
133
+ file_type = st.sidebar.radio(
134
+ "Choose a Vectorization technique :",
135
+ ("Word2Vec", "Fasttext"))
136
+
137
+ st.header("Word Embedding Technique")
138
+ st.markdown('''
139
+ - Word embedding is an advanced vectorization approach that converts text into vectors while preserving semantic meaning
140
+ - Any technique that preserves semantic meaning while converting text into vectors is a word embedding technique
141
+ - This page covers two word embedding techniques:
142
+ - Word2Vec
143
+ - FastText
144
+ ''')
145
+
146
+ if file_type == "Word2Vec":
147
+ st.title(":red[Word2Vec]")
148
+ st.markdown(
149
+ """
150
+ <h3 style='color: #6A0572;'>📌 How Does Word2Vec Work?</h3>
151
+ <ul>
152
+ <li>After <strong>training</strong>, we obtain the final <span class='highlight'>Word2Vec model</span></li>
153
+ <li>The model stores a <strong>dictionary</strong> with word-vector pairs:</li>
154
+ </ul>
155
+ <pre style="background-color:#F7F7F7; padding: 10px; border-radius: 5px;">
156
+ { w1: [v1], w2: [v2], w3: [v3] }
157
+ </pre>
158
+ """,
159
+ unsafe_allow_html=True,
160
+ )
161
+ st.markdown(
162
+ """
163
+ <h3 style='color: #6A0572;'>⚙️ Training vs. Test Time</h3>
164
+ <ul>
165
+ <li><strong>Training Time</strong>: <span class='highlight'>Corpus + Shallow Neural Network</span> → Generates Model</li>
166
+ <li><strong>Test Time</strong>: <span class='highlight'>Word</span> → Looked up in Dictionary → Returns <span class='highlight'>Vector Representation</span></li>
167
+ </ul>
168
+ """,
169
+ unsafe_allow_html=True,
170
+ )
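A minimal sketch of the test-time lookup described above, assuming a hand-written toy dictionary in place of a trained model. The words, the vectors, and the helper name `lookup` are all hypothetical and only illustrate the word-to-vector mapping.

```python
# Hypothetical toy "model": in practice these vectors come from training on a corpus.
toy_model = {
    "king": [0.8, 0.1, 0.3],
    "queen": [0.7, 0.2, 0.3],
    "apple": [0.1, 0.9, 0.5],
}


def lookup(word, model):
    """Return the stored vector for `word`, or None if the word is not in the dictionary."""
    return model.get(word)


print(lookup("king", toy_model))   # a word seen during training
print(lookup("mango", toy_model))  # a word missing from the dictionary
```

A word that never appeared in the corpus has no entry, which is the out-of-vocabulary case.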
171
+
172
+ st.markdown(
173
+ """
174
+ <h3 style='color: #6A0572;'>🔍 How Does It Preserve Meaning?</h3>
175
+ <ul>
176
+ <li>It learns from the <strong>context</strong> of words in the <span class='highlight'>corpus</span></li>
177
+ <li>When given a word, it checks in the dictionary and retrieves the <strong>semantic vector</strong></li>
178
+ <li>Unlike bag-of-words or TF-IDF, the <span class='highlight'>dimensions are not vocabulary words</span> but learned features that capture meaning</li>
179
+ </ul>
180
+ """,
181
+ unsafe_allow_html=True,
182
+ )
183
+
184
+ st.markdown(
185
+ """
186
+ <h3 style='color: #6A0572;'>📚 Why is Corpus Important?</h3>
187
+ <ul>
188
+ <li>The <strong>Word2Vec algorithm</strong> is completely dependent on the corpus</li>
189
+ <li>Better corpus → Better word representation</li>
190
+ <li>It <strong>preserves semantic meaning</strong> using neighborhood words (context)</li>
191
+ </ul>
192
+ """,
193
+ unsafe_allow_html=True,
194
+ )
195
+ st.markdown('''
196
+ - Word2Vec converts individual words into vectors, not whole documents
197
+ - Two techniques build a vector for an entire document from its word vectors
198
+ - They are:
199
+ - Average Word2Vec
200
+ - TF-IDF Word2Vec
201
+ ''')
202
+
203
+ st.subheader(":blue[Average Word2Vec]")
204
+ st.markdown(
205
+ """
206
+ <h3 style='color: #6A0572;'>📌 Step-by-Step Process</h3>
207
+ <ul>
208
+ <li>Given a document <span class='highlight'>d1</span>: <strong>w1, w2, w3</strong></li>
209
+ <li>Retrieve vector representations <strong>v1, v2, v3</strong> from Word2Vec</li>
210
+ <li>Perform <span class='highlight'>element-wise addition</span> of vectors:
211
+ <pre style="background-color:#F7F7F7; padding: 10px; border-radius: 5px;">
212
+ v_total = v1 + v2 + v3
213
+ </pre>
214
+ </li>
215
+ <li>Normalize by dividing by the total number of words (element-wise division):
216
+ <pre style="background-color:#F7F7F7; padding: 10px; border-radius: 5px;">
217
+ v_avg = v_total / len(d1)
218
+ </pre>
219
+ </li>
220
+ <li>Final representation contains the <span class='highlight'>average meaning</span> of all words</li>
221
+ </ul>
222
+ """,
223
+ unsafe_allow_html=True,
224
+ )
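The averaging steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the page's implementation: the toy vectors and the function name `average_word2vec` are hypothetical, and words missing from the model are skipped rather than counted in the divisor.

```python
def average_word2vec(tokens, model):
    """Average the vectors of the tokens that exist in `model`.

    Assumption: out-of-vocabulary tokens are ignored, so the divisor is the
    number of tokens actually found, not len(tokens).
    """
    vectors = [model[t] for t in tokens if t in model]
    if not vectors:
        return None
    dim = len(vectors[0])
    # Element-wise addition: v_total = v1 + v2 + v3
    v_total = [sum(v[i] for v in vectors) for i in range(dim)]
    # Element-wise division by the number of word vectors
    return [x / len(vectors) for x in v_total]


# Hypothetical 2-dimensional vectors for the document d1 = w1, w2, w3
toy_vectors = {"w1": [1.0, 2.0], "w2": [3.0, 4.0], "w3": [5.0, 6.0]}
d1 = ["w1", "w2", "w3"]
v_avg = average_word2vec(d1, toy_vectors)
print(v_avg)
```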
225
+
226
+ st.markdown(
227
+ """
228
+ <h3 style='color: #6A0572;'>⚠️ Problem: Equal Importance to Every Word</h3>
229
+ <ul>
230
+ <li>Average Word2Vec assigns <span class='highlight'>equal weight</span> to all words</li>
231
+ <li>No emphasis on <strong>important words</strong> that carry significant meaning</li>
232
+ <li>This limits the effectiveness in understanding <span class='highlight'>word importance</span></li>
233
+ </ul>
234
+ """,
235
+ unsafe_allow_html=True,
236
+ )
237
+
238
+ st.markdown(
239
+ """
240
+ <strong>Average Word2Vec averages word vectors but does not give extra weight to important words!</strong>
241
+ """,
242
+ unsafe_allow_html=True,
243
+ )
244
+
245
+ st.subheader(":blue[TF-IDF Word2Vec]")
246
+ st.markdown(
247
+ """
248
+ <h3 style='color: #6A0572;'>⚠️ Issue with Average Word2Vec</h3>
249
+ <ul>
250
+ <li>Gives equal importance to every word</li>
251
+ <li>Even words that appear frequently in a document but rarely in the corpus get equal weight</li>
252
+ </ul>
253
+ """,
254
+ unsafe_allow_html=True,
255
+ )
256
+
257
+ st.markdown(
258
+ """
259
+ <h3 style='color: #6A0572;'>🚀 Solution: Adding Weightage</h3>
260
+ <ul>
261
+ <li>Consider a document with 3 words: <strong>w1, w2, w3</strong></li>
262
+ <li>Each word has a vector representation:
263
+ <pre style="background-color:#F7F7F7; padding: 10px; border-radius: 5px;">
264
+ w1 → v1, w2 → v2, w3 → v3
265
+ </pre>
266
+ </li>
267
+ <li>We use <span class='highlight'>two models</span>:
268
+ <ul>
269
+ <li><strong>TF-IDF</strong> → Computes weightage for each word</li>
270
+ <li><strong>Word2Vec</strong> → Converts words into vectors</li>
271
+ </ul>
272
+ </li>
273
+ <li>For each word, multiply its vector by its TF-IDF value</li>
274
+ </ul>
275
+ """,
276
+ unsafe_allow_html=True,
277
+ )
278
+
279
+ st.markdown(
280
+ """
281
+ <strong>Final Weighted Representation:</strong>
282
+ <pre style="background-color:#F7F7F7; padding: 10px; border-radius: 5px;">
283
+ v_final = (TF-IDF(w1) * v1 + TF-IDF(w2) * v2 + TF-IDF(w3) * v3)
284
+ / (TF-IDF(w1) + TF-IDF(w2) + TF-IDF(w3))
285
+ </pre>
286
+ """,
287
+ unsafe_allow_html=True,
288
+ )
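A short sketch of the weighted formula above, assuming the TF-IDF values have already been computed elsewhere. The weights, the vectors, and the function name `tfidf_word2vec` are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def tfidf_word2vec(tokens, vectors, tfidf):
    """Weighted average: sum(tfidf[w] * v[w]) / sum(tfidf[w]) over known words."""
    known = [t for t in tokens if t in vectors and t in tfidf]
    weight_sum = sum(tfidf[t] for t in known)
    if not known or weight_sum == 0:
        return None
    dim = len(vectors[known[0]])
    weighted = [sum(tfidf[t] * vectors[t][i] for t in known) for i in range(dim)]
    return [x / weight_sum for x in weighted]


# Hypothetical inputs: w1 is treated as the most important word in the document.
toy_vectors = {"w1": [1.0, 2.0], "w2": [3.0, 4.0], "w3": [5.0, 6.0]}
toy_tfidf = {"w1": 0.5, "w2": 0.25, "w3": 0.25}
v_final = tfidf_word2vec(["w1", "w2", "w3"], toy_vectors, toy_tfidf)
print(v_final)
```

With equal weights this reduces to the plain average; here the result is pulled toward `w1` because it has the largest weight.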
289
+ st.subheader("How to train our own W2V model")
290
+ st.markdown('''
291
+ - At training time, the Word2Vec algorithm can learn from the corpus using one of two techniques
292
+ - They are:
293
+ - Skip-gram
294
+ - CBOW
295
+ ''')
296
+
297
+ st.subheader(":red[CBOW]")
298
+ st.markdown(
299
+ """
300
+ <div class='box'>
301
+ <h3 style='color: #6A0572;'>What is CBOW?</h3>
302
+ <p><strong>CBOW (Continuous Bag of Words)</strong> is a technique where we use surrounding words (context) to predict the target word (focus word).</p>
303
+ </div>
304
+ """,
305
+ unsafe_allow_html=True,
306
+ )
307
+ st.markdown(
308
+ """
309
+ <h3 style='color: #6A0572;'>📂 Example Corpus</h3>
310
+ <ul>
311
+ <li><strong>d1:</strong> w1, w2, w3, w4, w5, w4</li>
312
+ <li><strong>d2:</strong> w3, w4, w5, w2, w1, w2, w3, w4</li>
313
+ </ul>
314
+ <p>We first preprocess the data to extract meaningful relationships.</p>
315
+ """,
316
+ unsafe_allow_html=True,
317
+ )
318
+
319
+ st.markdown(
320
+ """
321
+ <h3 style='color: #6A0572;'>📌 Steps to Process the Data</h3>
322
+ <ul>
323
+ <li>Create a <span class='highlight'>vocabulary</span> from the entire corpus: <pre style="background-color:#F7F7F7; padding: 10px; border-radius: 5px;">{w1, w2, w3, w4, w5}</pre></li>
324
+ <li>Generate a <strong>tabular dataset</strong> with:
325
+ <ul>
326
+ <li><strong>Feature variables (Context Words)</strong></li>
327
+ <li><strong>Class variables (Target Words)</strong></li>
328
+ </ul>
329
+ </li>
330
+ <li>Apply a <span class='highlight'>window size</span> of 2 (how many neighbors we consider).</li>
331
+ <li>Slide the window over the text with <span class='highlight'>slide = 1</span>.</li>
332
+ </ul>
333
+ """,
334
+ unsafe_allow_html=True,
335
+ )
336
+
337
+ st.markdown(
338
+ """
339
+ <h3 style='color: #6A0572;'> Handling Variable Context Length</h3>
340
+ <ul>
341
+ <li>To ensure a consistent feature length, we use <strong>zero-padding</strong> when needed.</li>
342
+ <li>The model tries to understand relationships based on the surrounding <span class='highlight'>context words</span>.</li>
343
+ </ul>
344
+ """,
345
+ unsafe_allow_html=True,
346
+ )
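The CBOW dataset construction above can be sketched as follows. This is a sketch under assumptions: a window size of 2 is read as two neighbors on each side of the focus word, and a `<pad>` token stands in for the zero-padding the page mentions. The names `cbow_rows` and `PAD` are hypothetical.

```python
PAD = "<pad>"  # placeholder token standing in for zero-padding (assumption)


def cbow_rows(tokens, window=2):
    """Build (context words, focus word) rows, sliding one position at a time."""
    rows = []
    for i, focus in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        # Pad so every row has exactly 2 * window context slots
        left = [PAD] * (window - len(left)) + left
        right = right + [PAD] * (window - len(right))
        rows.append((left + right, focus))
    return rows


d1 = ["w1", "w2", "w3", "w4", "w5", "w4"]
for context, focus in cbow_rows(d1):
    print(context, "->", focus)
```

Each row pairs the feature variables (context words) with the class variable (focus word), which is the tabular form passed to the network.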
347
+ st.markdown(
348
+ """
349
+ <strong>Mathematical Representation:</strong>
350
+ <pre style="background-color:#F7F7F7; padding: 10px; border-radius: 5px;">
351
+ y = f(xi)
352
+ where,
353
+ y = Focus Word (Target)
354
+ xi = Context Words (Neighbors)
355
+ </pre>
356
+ """,
357
+ unsafe_allow_html=True,
358
+ )
359
+
360
+ st.markdown(
361
+ """
362
+ <h3 style='color: #6A0572;'> Training with Artificial Neural Networks</h3>
363
+ <p>The tabular data is passed to an <strong>Artificial Neural Network (ANN)</strong> which learns:</p>
364
+ <ul>
365
+ <li>How <span class='highlight'>context words</span> are related to <span class='highlight'>focus words</span>.</li>
366
+ </ul>
367
+ """,
368
+ unsafe_allow_html=True,
369
+ )
370
+
371
+ st.subheader(":red[Skip-gram]")
372
+ st.markdown(
373
+ """
374
+ <div class='box'>
375
+ <h3 style='color: #6A0572;'>What is Skip-gram?</h3>
376
+ <p><strong>Skip-gram</strong> is a technique where we use the focus word to predict its context words.</p>
377
+ </div>
378
+ """,
379
+ unsafe_allow_html=True,
380
+ )
381
+
382
+ st.markdown(
383
+ """
384
+ <h3 style='color: #6A0572;'>📂 Example Corpus</h3>
385
+ <ul>
386
+ <li><strong>d1:</strong> w1, w2, w3, w4, w5, w4</li>
387
+ <li><strong>d2:</strong> w3, w4, w5, w2, w1, w2, w3, w4</li>
388
+ </ul>
389
+ <p>We first preprocess the data to extract meaningful relationships.</p>
390
+ """,
391
+ unsafe_allow_html=True,
392
+ )
393
+
394
+ st.markdown(
395
+ """
396
+ <h3 style='color: #6A0572;'>📌 Steps to Process the Data</h3>
397
+ <ul>
398
+ <li>Create a <span class='highlight'>vocabulary</span> from the entire corpus: <pre style="background-color:#F7F7F7; padding: 10px; border-radius: 5px;">{w1, w2, w3, w4, w5}</pre></li>
399
+ <li>Generate a <strong>tabular dataset</strong> with:
400
+ <ul>
401
+ <li><strong>Feature variables (Focus Words)</strong></li>
402
+ <li><strong>Class variables (Context Words)</strong></li>
403
+ </ul>
404
+ </li>
405
+ <li>Apply a <span class='highlight'>window size</span> of 2 (how many neighbors we consider).</li>
406
+ <li>Slide the window over the text with <span class='highlight'>slide = 1</span>.</li>
407
+ </ul>
408
+ """,
409
+ unsafe_allow_html=True,
410
+ )
411
+
412
+ st.markdown(
413
+ """
414
+ <h3 style='color: #6A0572;'> Handling Variable Context Length</h3>
415
+ <ul>
416
+ <li>To ensure a consistent feature length, we use <strong>zero-padding</strong> when needed.</li>
417
+ <li>The model tries to understand relationships based on the <span class='highlight'>focus words</span>.</li>
418
+ </ul>
419
+ """,
420
+ unsafe_allow_html=True,
421
+ )
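A minimal sketch of skip-gram training pairs, assuming a window of 2 neighbors on each side. Unlike the padded table described above, this variant emits one (focus, context) pair per neighbor, so no padding is needed. That is a design choice of this sketch, not something the page specifies, and `skipgram_pairs` is a hypothetical name.

```python
def skipgram_pairs(tokens, window=2):
    """Return (focus word, context word) pairs for every position in `tokens`."""
    pairs = []
    for i, focus in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # the focus word is not its own context
                pairs.append((focus, tokens[j]))
    return pairs


d1 = ["w1", "w2", "w3", "w4", "w5", "w4"]
for focus, context in skipgram_pairs(d1):
    print(focus, "->", context)
```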
422
+
423
+ st.markdown(
424
+ """
425
+ <strong>Mathematical Representation:</strong>
426
+ <pre style="background-color:#F7F7F7; padding: 10px; border-radius: 5px;">
427
+ y = f(xi)
428
+ where,
429
+ y = Context Words (Neighbors)
430
+ xi = Focus Word (Target)
431
+ </pre>
432
+ """,
433
+ unsafe_allow_html=True,
434
+ )
435
+
436
+ st.markdown(
437
+ """
438
+ <h3 style='color: #6A0572;'> Training with Artificial Neural Networks</h3>
439
+ <p>The tabular data is passed to an <strong>Artificial Neural Network (ANN)</strong> which learns:</p>
440
+ <ul>
441
+ <li>How <span class='highlight'>focus words</span> are related to <span class='highlight'>context words</span>.</li>
442
+ </ul>
443
+ """,
444
+ unsafe_allow_html=True,
445
+ )
446
+
447
+
448
+ elif file_type == "Fasttext":
449
+ st.title(":red[FastText]")
450
+ st.markdown(
451
+ """
452
+ <p><strong>FastText</strong> is an advanced word vectorization technique that enhances word embeddings by considering subword information.</p>
453
+ <p>It is a <span class='highlight'>simple extension</span> of Word2Vec, which converts words into vectors.</p>
454
+ """,
455
+ unsafe_allow_html=True,
456
+ )
457
+
458
+ st.markdown(
459
+ """
460
+ <h3 style='color: #6A0572;'> Implementing FastText</h3>
461
+ <p>FastText can be implemented using:</p>
462
+ <ul>
463
+ <li><strong>CBOW (Continuous Bag of Words)</strong></li>
464
+ <li><strong>Skip-gram</strong></li>
465
+ </ul>
466
+ """,
467
+ unsafe_allow_html=True,
468
+ )
469
+
470
+ st.markdown(
471
+ """
472
+ <strong>CBOW Representation:</strong>
473
+ <pre style="background-color:#F7F7F7; padding: 10px; border-radius: 5px;">
474
+ y = f(xi)
475
+ where,
476
+ y = Focus Word
477
+ xi = Context Words
478
+ </pre>
479
+ <strong>Skip-gram Representation:</strong>
480
+ <pre style="background-color:#F7F7F7; padding: 10px; border-radius: 5px;">
481
+ y = f(xi)
482
+ where,
483
+ y = Context Words
484
+ xi = Focus Word
485
+ </pre>
486
+ """,
487
+ unsafe_allow_html=True,
488
+ )
489
+
490
+ st.markdown(
491
+ """
492
+ <h3 style='color: #6A0572;'> Problem: Out-of-Vocabulary (OOV)</h3>
493
+ <p>Word-level embedding techniques such as Word2Vec have no vector for a word that was absent from the training corpus.</p>
494
+ <p><span class='highlight'>FastText overcomes this issue</span> by breaking words into subword units (character n-grams).</p>
495
+ """,
496
+ unsafe_allow_html=True,
497
+ )
498
+
499
+ st.markdown(
500
+ """
501
+ <h3 style='color: #6A0572;'>Implementing CBOW with Character N-Grams</h3>
502
+ <ul>
503
+ <li><span class='highlight'>Window Size</span>: 5</li>
504
+ <li><span class='highlight'>Window</span>: 2</li>
505
+ <li><span class='highlight'>Slide</span>: 1</li>
506
+ </ul>
507
+ <p>A tabular format is created with <strong>context words</strong> and <strong>focus words</strong>.</p>
508
+ """,
509
+ unsafe_allow_html=True,
510
+ )
511
+ st.markdown(
512
+ """
513
+ ## Example Sentences:
514
+ - **d1:** "apple is good for health"
515
+ - **d2:** "biryani is not good for health"
516
+
517
+ This application creates a table for **context words** and **focus words** using **character 2-grams**.
518
+ """
519
+ )
520
+
521
+ st.markdown('''
522
+ - Character 2-Gram Table:
523
+
524
+ - "Context Words": ["ap", "pp", "pl", "le", "is"]
525
+
526
+ - "Focus Words": ["go", "oo", "od"]
527
+ ''')
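The character 2-grams listed above can be reproduced with a short helper. This is a simplified sketch: the function name `char_ngrams` is hypothetical, and the sketch splits plain words without any extra boundary handling.

```python
def char_ngrams(word, n=2):
    """Return the overlapping character n-grams of `word`, left to right."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


# Context words "apple" and "is", focus word "good" from example sentence d1
context_grams = char_ngrams("apple") + char_ngrams("is")
focus_grams = char_ngrams("good")
print(context_grams)
print(focus_grams)
```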
528
+
529
+ st.markdown(
530
+ """
531
+ - This representation provides an **average 2D vector** for words.
532
+ """
533
+ )
534
+
535
+ st.markdown(
536
+ """
537
+ <h3 style='color: #6A0572;'>Vocabulary</h3>
538
+ <p>The vocabulary consists of <span class='highlight'>unique character n-grams</span>.</p>
539
+ <pre style="background-color:#F7F7F7; padding: 10px; border-radius: 5px;">
540
+ { keys: values }
541
+ where,
542
+ - Keys: Character n-grams
543
+ - Values: Vector representations
544
+ </pre>
545
+ """,
546
+ unsafe_allow_html=True,
547
+ )
548
+
549
+ st.markdown(
550
+ """
551
+ <h3 style='color: #6A0572;'> FastText Model</h3>
552
+ <ul>
553
+ <li>The dictionary created is the <span class='highlight'>FastText model</span>.</li>
554
+ <li>Text is broken down into <strong>character n-grams</strong> to generate vector representations.</li>
555
+ <li>It follows <span class='highlight'>element-wise addition</span>, giving an <strong>average 2D representation</strong> of the word.</li>
556
+ </ul>
557
+ """,
558
+ unsafe_allow_html=True,
559
+ )
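A sketch of how a word vector could be assembled from the n-gram dictionary, assuming character 2-grams and a hand-written toy dictionary. The vectors and the name `word_vector` are hypothetical. The point is that a word unseen in training can still get a vector as long as some of its n-grams are in the dictionary.

```python
def word_vector(word, ngram_model, n=2):
    """Average the vectors of the word's character n-grams found in `ngram_model`."""
    grams = [word[i:i + n] for i in range(len(word) - n + 1)]
    known = [ngram_model[g] for g in grams if g in ngram_model]
    if not known:
        return None  # none of the word's n-grams were seen in training
    dim = len(known[0])
    # Element-wise addition followed by division by the number of n-grams used
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


# Hypothetical n-gram vectors, as if learned from the word "good"
toy_ngram_model = {"go": [3.0, 0.0], "oo": [0.0, 3.0], "od": [3.0, 3.0]}

print(word_vector("good", toy_ngram_model))  # word whose n-grams were all seen
print(word_vector("goo", toy_ngram_model))   # unseen word that shares n-grams
print(word_vector("xyz", toy_ngram_model))   # no n-gram overlap at all
```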