Daksh0505 committed on
Commit 3938dc6 · verified · 1 Parent(s): 571c11e

Update app.py

Files changed (1)
  1. app.py +90 -0
app.py CHANGED
@@ -48,6 +48,96 @@ with st.expander("🎯 Purpose"):
  - Exploring how encoder outputs can serve as **context embeddings** for downstream NLP tasks
  """)

+ # ------------------------------------------------
+ # Self Attention Section
+ # ------------------------------------------------
+ with st.expander("🔹 Self-Attention Mechanism"):
+     st.markdown("""
+ Self-Attention is a mechanism where each token in a sequence attends to **other tokens in the same sequence** to capture dependencies.
+
+ **Key points:**
+ - Helps the model focus on relevant words within the same sentence.
+ - Computes attention scores between all pairs of positions in the input.
+ - Often implemented as **Multi-Head Self-Attention** to capture different types of relationships simultaneously.
+
+ **Example:**
+ In the sentence *"The cat sat on the mat"*, self-attention allows the model to understand that *"cat"* is related to *"sat"* and *"mat"*.
+ """)
+
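For reference, the mechanism described in the Self-Attention expander above boils down to scaled dot-product attention over a single sequence. The snippet below is an illustrative sketch only, not part of app.py; it assumes PyTorch is available, and the toy dimensions are arbitrary:

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention: every token in x attends to every token in x."""
    q = x @ w_q                               # queries (seq_len, d)
    k = x @ w_k                               # keys    (seq_len, d)
    v = x @ w_v                               # values  (seq_len, d)
    scores = q @ k.T / k.shape[-1] ** 0.5     # attention scores for all pairs of positions
    weights = F.softmax(scores, dim=-1)       # each row sums to 1
    return weights @ v                        # one context vector per token (seq_len, d)

# Toy run: 6 tokens ("The cat sat on the mat"), embedding size 8
x = torch.randn(6, 8)
w_q, w_k, w_v = torch.randn(8, 8), torch.randn(8, 8), torch.randn(8, 8)
print(self_attention(x, w_q, w_k, w_v).shape)  # torch.Size([6, 8])
```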
+ # ------------------------------------------------
+ # Cross Attention Section
+ # ------------------------------------------------
+ with st.expander("🔹 Cross-Attention Mechanism"):
+     st.markdown("""
+ Cross-Attention is used in encoder-decoder architectures where the **decoder attends to encoder outputs**.
+
+ **Key points:**
+ - Decoder queries attend over encoder outputs to focus on relevant parts of the input sentence.
+ - Crucial for translation, summarization, or any sequence-to-sequence task.
+
+ **Example:**
+ Translating *"I am hungry"* to Hindi: when generating the Hindi word *"भूखा"*, cross-attention helps the decoder focus on *"hungry"* in the English input.
+ """)
+
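The cross-attention described above uses the same computation as self-attention, except that the queries come from the decoder while the keys and values come from the encoder outputs. A minimal sketch, not part of app.py, assuming PyTorch (learned projections omitted for brevity):

```python
import torch
import torch.nn.functional as F

def cross_attention(decoder_states, encoder_outputs):
    """Decoder queries attend over encoder outputs (used here as keys and values)."""
    q = decoder_states                         # (tgt_len, d)
    k = v = encoder_outputs                    # (src_len, d)
    scores = q @ k.T / k.shape[-1] ** 0.5      # (tgt_len, src_len)
    weights = F.softmax(scores, dim=-1)        # how much each target step looks at each source token
    return weights @ v, weights                # context (tgt_len, d) and alignment matrix

src = torch.randn(4, 16)   # encoder outputs for the English input, e.g. "I am hungry" + EOS
tgt = torch.randn(2, 16)   # decoder hidden states while generating the Hindi output
context, alignment = cross_attention(tgt, src)
print(context.shape, alignment.shape)  # torch.Size([2, 16]) torch.Size([2, 4])
```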
+ # ------------------------------------------------
+ # Multi-Head Attention Section
+ # ------------------------------------------------
+ with st.expander("🔹 Multi-Head Attention"):
+     st.markdown("""
+ Multi-Head Attention is an extension of the attention mechanism that allows the model to **capture information from different representation subspaces simultaneously**.
+
+ **Key Points:**
+ - Instead of using a single attention function, we use **multiple attention heads**.
+ - Each head learns to focus on **different parts or relationships** of the input.
+ - The outputs from all heads are **concatenated and linearly projected** to form the final context vector.
+ - Improves the model’s ability to understand complex dependencies in sequences.
+
+ **Example:**
+ - In translating *"The cat sat on the mat"*:
+ - Head 1 may focus on subject-verb relations (*cat ↔ sat*).
+ - Head 2 may focus on verb-object relations (*sat ↔ mat*).
+ - Head 3 may focus on positional or syntactic patterns.
+ - Combining all heads gives a richer context for the decoder.
+
+ **In this Seq2Seq model:**
+ - Multi-Head Attention can be used as:
+ - **Self-Attention** in encoder/decoder layers
+ - **Cross-Attention** between encoder outputs and decoder hidden states
+ """)
+
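A minimal sketch of the split-attend-concatenate-project pattern described in the Multi-Head Attention expander above. It is illustrative only, not the implementation in app.py; PyTorch is assumed, and the class name and sizes are made up for the example:

```python
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Project Q/K/V, split into heads, attend per head, concatenate, project."""
    def __init__(self, d_model=64, num_heads=4):
        super().__init__()
        assert d_model % num_heads == 0
        self.h, self.d_k = num_heads, d_model // num_heads
        self.w_q, self.w_k, self.w_v = (nn.Linear(d_model, d_model) for _ in range(3))
        self.w_o = nn.Linear(d_model, d_model)            # final linear projection

    def forward(self, query, key, value):
        B, Tq, _ = query.shape
        Tk = key.shape[1]
        # reshape to (batch, heads, time, d_k) so each head works in its own subspace
        q = self.w_q(query).view(B, Tq, self.h, self.d_k).transpose(1, 2)
        k = self.w_k(key).view(B, Tk, self.h, self.d_k).transpose(1, 2)
        v = self.w_v(value).view(B, Tk, self.h, self.d_k).transpose(1, 2)
        weights = (q @ k.transpose(-2, -1) / self.d_k ** 0.5).softmax(dim=-1)
        heads = weights @ v                                # (B, h, Tq, d_k)
        concat = heads.transpose(1, 2).reshape(B, Tq, -1)  # concatenate all heads
        return self.w_o(concat)

mha = MultiHeadAttention()
x = torch.randn(2, 6, 64)        # 2 sentences, 6 tokens each
print(mha(x, x, x).shape)        # self-attention usage: torch.Size([2, 6, 64])
```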
+ # ------------------------------------------------
+ # Seq2Seq Task Explanation Section
+ # ------------------------------------------------
+ with st.expander("🔹 Sequence-to-Sequence (Seq2Seq) Task"):
+     st.markdown("""
+ Seq2Seq models map an **input sequence** to an **output sequence**, often with **different lengths**.
+
+ **Examples:**
+ - Machine Translation: English → Hindi
+ - Text Summarization
+ - Chatbots / Dialogue Systems
+
+ **Characteristics:**
+ - Handles variable-length input and output sequences.
+ - Uses an encoder to process the input and a decoder to generate the output.
+ - Can integrate attention mechanisms to improve alignment between input and output tokens.
+ """)
+
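As a rough illustration of the encoder-decoder pattern described above, the sketch below encodes a source sequence and then greedily decodes an output whose length is independent of the input length. It is not the model used in app.py; the GRU layers, vocabulary sizes, and the `TinySeq2Seq` name are assumptions for the example:

```python
import torch
import torch.nn as nn

class TinySeq2Seq(nn.Module):
    """Minimal encoder-decoder: encode the source, then decode token by token."""
    def __init__(self, src_vocab, tgt_vocab, d=32):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, d)
        self.tgt_emb = nn.Embedding(tgt_vocab, d)
        self.encoder = nn.GRU(d, d, batch_first=True)
        self.decoder = nn.GRU(d, d, batch_first=True)
        self.out = nn.Linear(d, tgt_vocab)

    def forward(self, src_ids, bos_id=1, max_len=10):
        _, h = self.encoder(self.src_emb(src_ids))                  # h summarises the input
        token = torch.full((src_ids.size(0), 1), bos_id, dtype=torch.long)
        outputs = []
        for _ in range(max_len):                                    # output length need not match input length
            dec_out, h = self.decoder(self.tgt_emb(token), h)
            token = self.out(dec_out[:, -1]).argmax(-1, keepdim=True)  # greedy next token
            outputs.append(token)
        return torch.cat(outputs, dim=1)

model = TinySeq2Seq(src_vocab=100, tgt_vocab=120)
src = torch.randint(0, 100, (2, 5))          # 2 source sentences, 5 tokens each
print(model(src).shape)                      # torch.Size([2, 10])
```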
+ # ------------------------------------------------
+ # Seq2Seq Task - Fixed-Length vs Variable-Length Section
+ # ------------------------------------------------
+ with st.expander("🔹 Fixed-Length vs Variable-Length Tasks"):
+     st.markdown("""
+ **Fixed-Length Tasks:**
+ - Input and output sequences have the **same length**.
+ - Examples: Time series forecasting with a fixed number of steps, classification tasks.
+
+ **Variable-Length Tasks:**
+ - Input and output sequences can **differ in length**.
+ - Examples: Machine translation, summarization, speech recognition.
+ - Seq2Seq models are designed to handle this flexibility.
+ """)
+
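One practical consequence of the variable-length case above: sequences are usually padded to a common length and masked so the model can ignore the padding. A tiny illustrative example, not from app.py, assuming PyTorch:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# three token-id sequences of different lengths (id 0 reserved for padding)
seqs = [torch.tensor([5, 9, 2]), torch.tensor([7, 1, 4, 8, 3]), torch.tensor([6])]
batch = pad_sequence(seqs, batch_first=True, padding_value=0)  # shape (3, 5)
mask = batch != 0                                              # True where tokens are real
print(batch)
print(mask)
```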
  # ------------------------------------------------
  # Load models and tokenizers
  # ------------------------------------------------