Daksh0505 committed on
Commit bc44d3c · verified · 1 Parent(s): 49c37e5

Update app.py

Files changed (1)
  1. app.py +95 -90
app.py CHANGED
@@ -48,96 +48,6 @@ with st.expander("🎯 Purpose"):
48
  - Exploring how encoder outputs can serve as **context embeddings** for downstream NLP tasks
49
  """)
50
 
51
- # ------------------------------------------------
52
- # Self Attention Section
53
- # ------------------------------------------------
54
- with st.expander("🔹 Self-Attention Mechanism"):
55
- st.markdown("""
56
- Self-Attention is a mechanism where each token in a sequence attends to **other tokens in the same sequence** to capture dependencies.
57
-
58
- **Key points:**
59
- - Helps the model focus on relevant words within the same sentence.
60
- - Computes attention scores between all pairs of positions in the input.
61
- - Often implemented as **Multi-Head Self-Attention** to capture different types of relationships simultaneously.
62
-
63
- **Example:**
64
- In the sentence *"The cat sat on the mat"*, self-attention allows the model to understand that *"cat"* is related to *"sat"* and *"mat"*.
65
- """)
66
-
67
- # ------------------------------------------------
68
- # Cross Attention Section
69
- # ------------------------------------------------
70
- with st.expander("🔹 Cross-Attention Mechanism"):
71
- st.markdown("""
72
- Cross-Attention is used in encoder-decoder architectures where the **decoder attends to encoder outputs**.
73
-
74
- **Key points:**
75
- - Decoder queries encoder outputs to focus on relevant parts of the input sentence.
76
- - Crucial for translation, summarization, or any sequence-to-sequence task.
77
-
78
- **Example:**
79
- Translating *"I am hungry"* to Hindi: when generating the Hindi word *"भूखा"*, cross-attention helps the decoder focus on *"hungry"* in the English input.
80
- """)
81
-
82
- # ------------------------------------------------
83
- # Multi-Head Attention Section
84
- # ------------------------------------------------
85
- with st.expander("🔹 Multi-Head Attention"):
86
- st.markdown("""
87
- Multi-Head Attention is an extension of the attention mechanism that allows the model to **capture information from different representation subspaces simultaneously**.
88
-
89
- **Key Points:**
90
- - Instead of using a single attention function, we use **multiple attention heads**.
91
- - Each head learns to focus on **different parts or relationships** of the input.
92
- - The outputs from all heads are **concatenated and linearly projected** to form the final context vector.
93
- - Improves the model’s ability to understand complex dependencies in sequences.
94
-
95
- **Example:**
96
- - In translating *"The cat sat on the mat"*:
97
- - Head 1 may focus on subject-verb relations (*cat ↔ sat*).
98
- - Head 2 may focus on verb-object relations (*sat ↔ mat*).
99
- - Head 3 may focus on positional or syntactic patterns.
100
- - Combining all heads gives a richer context for the decoder.
101
-
102
- **In your Seq2Seq Model:**
103
- - Multi-Head Attention can be used as:
104
- - **Self-Attention** in encoder/decoder layers
105
- - **Cross-Attention** between encoder outputs and decoder hidden states
106
- """)
107
-
108
- # ------------------------------------------------
109
- # Seq2Seq task Explaining Section
110
- # ------------------------------------------------
111
- with st.expander("🔹 Sequence-to-Sequence (Seq2Seq) Task"):
112
- st.markdown("""
113
- Seq2Seq models map an **input sequence** to an **output sequence**, often with **different lengths**.
114
-
115
- **Examples:**
116
- - Machine Translation: English → Hindi
117
- - Text Summarization
118
- - Chatbots / Dialogue Systems
119
-
120
- **Characteristics:**
121
- - Handles variable-length input and output sequences.
122
- - Uses encoder to process input, decoder to generate output.
123
- - Can integrate attention mechanisms to improve alignment between input and output tokens.
124
- """)
125
-
126
- # ------------------------------------------------
127
- # Seq2Seq Task- Fixed-Length vs Variable-Length Section
128
- # ------------------------------------------------
129
- with st.expander("🔹 Fixed-Length vs Variable-Length Tasks"):
130
- st.markdown("""
131
- **Fixed-Length Tasks:**
132
- - Input and output sequences have the **same length**.
133
- - Example: Time series forecasting with fixed steps, classification tasks.
134
-
135
- **Variable-Length Tasks:**
136
- - Input and output sequences can **differ in length**.
137
- - Example: Machine translation, summarization, speech recognition.
138
- - Seq2Seq models are designed to handle this flexibility.
139
- """)
140
-
141
  # ------------------------------------------------
142
  # Load models and tokenizers
143
  # ------------------------------------------------
@@ -308,6 +218,101 @@ if st.button("🚀 Translate"):
218
  if st.session_state.translation:
219
  st.success(f"✅ **Predicted Hindi Translation:** {st.session_state.translation}")
220
 
221
+ # ------------------------------------------------
222
+ # Learning Header
223
+ # ------------------------------------------------
224
+ st.subheader("Leaning how it works")
225
+
226
+ # ------------------------------------------------
227
+ # Self Attention Section
228
+ # ------------------------------------------------
229
+ with st.expander("🔹 Self-Attention Mechanism"):
230
+ st.markdown("""
231
+ Self-Attention is a mechanism where each token in a sequence attends to **other tokens in the same sequence** to capture dependencies.
232
+
233
+ **Key points:**
234
+ - Helps the model focus on relevant words within the same sentence.
235
+ - Computes attention scores between all pairs of positions in the input.
236
+ - Often implemented as **Multi-Head Self-Attention** to capture different types of relationships simultaneously.
237
+
238
+ **Example:**
239
+ In the sentence *"The cat sat on the mat"*, self-attention allows the model to understand that *"cat"* is related to *"sat"* and *"mat"*.
240
+ """)
241
+
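To make the self-attention description above concrete, here is a minimal NumPy sketch of scaled dot-product self-attention. The shapes and random projection matrices are illustrative only and are not taken from the model in app.py.

```python
# Minimal scaled dot-product self-attention sketch (illustrative, not app.py's model).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)     # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """X: (seq_len, d_model). Every token attends to every token in the same sequence."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v          # queries, keys, values from the SAME sequence
    scores = Q @ K.T / np.sqrt(K.shape[-1])      # (seq_len, seq_len) pairwise attention scores
    weights = softmax(scores, axis=-1)           # each row sums to 1
    return weights @ V, weights                  # context vectors + attention map

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 8))                      # e.g. 6 tokens ("The cat sat on the mat"), d_model=8
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
context, attn = self_attention(X, W_q, W_k, W_v)
print(context.shape, attn.shape)                 # (6, 8) (6, 6)
```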
242
+ # ------------------------------------------------
243
+ # Cross Attention Section
244
+ # ------------------------------------------------
245
+ with st.expander("🔹 Cross-Attention Mechanism"):
246
+ st.markdown("""
247
+ Cross-Attention is used in encoder-decoder architectures where the **decoder attends to encoder outputs**.
248
+
249
+ **Key points:**
250
+ - Decoder queries encoder outputs to focus on relevant parts of the input sentence.
251
+ - Crucial for translation, summarization, or any sequence-to-sequence task.
252
+
253
+ **Example:**
254
+ Translating *"I am hungry"* to Hindi: when generating the Hindi word *"भूखा"*, cross-attention helps the decoder focus on *"hungry"* in the English input.
255
+ """)
256
+
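The decoder-attends-to-encoder idea can be sketched the same way: queries come from the decoder, keys and values from the encoder outputs. The states and projection matrices below are placeholders, not the actual tensors in app.py.

```python
# Minimal cross-attention sketch: decoder queries score encoder outputs (illustrative only).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(decoder_states, encoder_outputs, W_q, W_k, W_v):
    Q = decoder_states @ W_q                     # (tgt_len, d): one query per decoding step
    K = encoder_outputs @ W_k                    # (src_len, d)
    V = encoder_outputs @ W_v                    # (src_len, d)
    scores = Q @ K.T / np.sqrt(K.shape[-1])      # (tgt_len, src_len): each target step scores every source token
    weights = softmax(scores, axis=-1)
    return weights @ V                           # source-aware context for each decoder step

rng = np.random.default_rng(1)
enc = rng.normal(size=(3, 8))                    # "I am hungry" -> 3 encoder states
dec = rng.normal(size=(1, 8))                    # decoder state while generating "भूखा"
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(cross_attention(dec, enc, W_q, W_k, W_v).shape)   # (1, 8)
```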
257
+ # ------------------------------------------------
258
+ # Multi-Head Attention Section
259
+ # ------------------------------------------------
260
+ with st.expander("🔹 Multi-Head Attention"):
261
+ st.markdown("""
262
+ Multi-Head Attention is an extension of the attention mechanism that allows the model to **capture information from different representation subspaces simultaneously**.
263
+
264
+ **Key Points:**
265
+ - Instead of using a single attention function, we use **multiple attention heads**.
266
+ - Each head learns to focus on **different parts or relationships** of the input.
267
+ - The outputs from all heads are **concatenated and linearly projected** to form the final context vector.
268
+ - Improves the model’s ability to understand complex dependencies in sequences.
269
+
270
+ **Example:**
271
+ - In translating *"The cat sat on the mat"*:
272
+ - Head 1 may focus on subject-verb relations (*cat ↔ sat*).
273
+ - Head 2 may focus on verb-object relations (*sat ↔ mat*).
274
+ - Head 3 may focus on positional or syntactic patterns.
275
+ - Combining all heads gives a richer context for the decoder.
276
+
277
+ **In your Seq2Seq Model:**
278
+ - Multi-Head Attention can be used as:
279
+ - **Self-Attention** in encoder/decoder layers
280
+ - **Cross-Attention** between encoder outputs and decoder hidden states
281
+ """)
282
+
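The split-attend-concatenate-project pattern described above can be sketched in a few lines of NumPy. The head count and dimensions are illustrative, not the configuration used by the app's model.

```python
# Minimal multi-head attention sketch: project, split into heads, attend per head,
# concatenate, then apply a final linear projection (illustrative only).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, W_q, W_k, W_v, W_o, num_heads):
    seq_len, d_model = X.shape
    d_head = d_model // num_heads
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    # reshape to (num_heads, seq_len, d_head) so each head works in its own subspace
    split = lambda M: M.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    Qh, Kh, Vh = split(Q), split(K), split(V)
    scores = Qh @ Kh.transpose(0, 2, 1) / np.sqrt(d_head)   # (heads, seq, seq)
    heads = softmax(scores, axis=-1) @ Vh                    # each head produces its own context
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ W_o                                      # final linear projection

rng = np.random.default_rng(2)
X = rng.normal(size=(6, 8))
W_q, W_k, W_v, W_o = (rng.normal(size=(8, 8)) for _ in range(4))
print(multi_head_attention(X, W_q, W_k, W_v, W_o, num_heads=2).shape)   # (6, 8)
```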
283
+ # ------------------------------------------------
284
+ # Seq2Seq task Explaining Section
285
+ # ------------------------------------------------
286
+ with st.expander("🔹 Sequence-to-Sequence (Seq2Seq) Task"):
287
+ st.markdown("""
288
+ Seq2Seq models map an **input sequence** to an **output sequence**, often with **different lengths**.
289
+
290
+ **Examples:**
291
+ - Machine Translation: English → Hindi
292
+ - Text Summarization
293
+ - Chatbots / Dialogue Systems
294
+
295
+ **Characteristics:**
296
+ - Handles variable-length input and output sequences.
297
+ - Uses encoder to process input, decoder to generate output.
298
+ - Can integrate attention mechanisms to improve alignment between input and output tokens.
299
+ """)
300
+
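The "encode once, decode token by token" loop behind Seq2Seq models can be sketched as below. The toy encode/decode functions and token ids are placeholders chosen only to make the loop runnable; they are not the functions used in app.py.

```python
# Toy sketch of Seq2Seq greedy decoding: the encoder runs once over the input,
# then the decoder emits tokens until an end-of-sequence token, so the output
# length is not tied to the input length (placeholders, not app.py's model).
SOS, EOS = 1, 2

def toy_encode(src_ids):
    return list(src_ids), sum(src_ids)             # "encoder outputs" and a summary state

def toy_decode_step(prev_id, state, enc_outputs):
    next_id = EOS if state <= 0 else prev_id + 5   # emit a few tokens, then stop
    return next_id, state - 10

def greedy_translate(src_ids, encode_fn, decode_step, max_len=50):
    enc_outputs, state = encode_fn(src_ids)        # run the encoder once on the whole input
    out_ids = [SOS]
    for _ in range(max_len):                       # generate until EOS or max_len
        next_id, state = decode_step(out_ids[-1], state, enc_outputs)
        if next_id == EOS:
            break
        out_ids.append(next_id)
    return out_ids[1:]                             # drop the start token

print(greedy_translate([4, 9, 17], toy_encode, toy_decode_step))   # e.g. [6, 11, 16]
```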
301
+ # ------------------------------------------------
302
+ # Seq2Seq Task- Fixed-Length vs Variable-Length Section
303
+ # ------------------------------------------------
304
+ with st.expander("🔹 Fixed-Length vs Variable-Length Tasks"):
305
+ st.markdown("""
306
+ **Fixed-Length Tasks:**
307
+ - Input and output sequences have the **same length**.
308
+ - Example: Time series forecasting with fixed steps, classification tasks.
309
+
310
+ **Variable-Length Tasks:**
311
+ - Input and output sequences can **differ in length**.
312
+ - Example: Machine translation, summarization, speech recognition.
313
+ - Seq2Seq models are designed to handle this flexibility.
314
+ """)
315
+
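One practical consequence of variable-length inputs is batching: sequences are usually padded to the longest example and a mask records which positions are real. The pad id and token ids below are made up for illustration.

```python
# Illustrative padding of variable-length token sequences with an accompanying mask,
# so attention and loss computations can ignore the padded positions.
PAD_ID = 0

def pad_batch(sequences, pad_id=PAD_ID):
    max_len = max(len(s) for s in sequences)
    padded = [s + [pad_id] * (max_len - len(s)) for s in sequences]
    mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in sequences]
    return padded, mask

batch, mask = pad_batch([[5, 8, 2], [7, 3, 9, 4, 2]])   # two sentences of different length
print(batch)   # [[5, 8, 2, 0, 0], [7, 3, 9, 4, 2]]
print(mask)    # [[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]]
```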
316
  # ------------------------------------------------
317
  # Show model architecture
318
  # ------------------------------------------------