Spaces: Harika22/Natural_Language_Processing

Harika22 committed on Feb 1, 2025

Commit dfceb96 (verified) · 1 Parent(s): 9408831

Update pages/6_Feature_Engineering.py

Files changed (1)

pages/6_Feature_Engineering.py +16 -18

@@ -149,21 +149,20 @@ if file_type == "One-Hot Vectorization":
     st.markdown("""
         ### 🛠️ Steps in One-Hot Vectorization:
-        1️⃣ Create a Vocabulary ➡️ (A set of all unique words in the collected corpus).
-        2️⃣ Find the Length of Vocabulary ➡️ (Total number of unique words = d-dimensions).
-        3️⃣ Convert Each Word into a Vector:
-           - 📌 Every unique word is transformed into a vector.
-           - 📌 Each vector has d-dimensions, where each dimension corresponds to a unique word.
-           - 📌 Words are converted individually, and then combined to form a vector.
-        ✅ This technique ensures that each word is treated uniquely and efficiently in NLP tasks.
         """)
     st.markdown("""
-        ### 🎯 Key Takeaways:
-        - 🎯 Each word gets a unique vector representation.
-        - 🎯 The number of dimensions = total vocabulary size.
-        - 🎯 Words are vectorized separately, then combined into document vectors.
     """)
     st.markdown("""
@@ -177,19 +176,18 @@ if file_type == "One-Hot Vectorization":
     """, unsafe_allow_html=True)
     st.markdown("""
-        ### 📝 Document Representations:
         - d₁ → v₁ → `[[1,0,0,0,0] , [0,1,0,0,0] , [0,0,1,0,0]]`
         - d₂ → v₂ → `[[1,0,0,0,0] , [0,1,0,0,0] , [0,0,0,1,0] , [0,0,1,0,0]]`
         - d₃ → v₃ → `[[0,0,0,0,1], [1,0,0,0,0]]`
-    ✅ This **One-Hot Vectorization** technique **converts words into numerical vectors** while preserving their uniqueness.
     """)
     st.markdown("""
-        ### 🎯 Key Takeaways:
-        - 🔹 **Each word** is represented as a **5-dimensional** vector.
-        - 🔹 **Every dimension** corresponds to a **unique word** in the vocabulary.
-        - 🔹 This method is **useful** for transforming text into a **numerical format** for Machine Learning tasks.
     """)

     st.markdown("""
         ### 🛠️ Steps in One-Hot Vectorization:
+         - Create a Vocabulary ➡️ (A set of all unique words in the collected corpus).
+         - Find the Length of Vocabulary ➡️ (Total number of unique words = d-dimensions).
+         - Convert Each Word into a Vector:
+           -  Every unique word is transformed into a vector.
+           -  Each vector has d-dimensions, where each dimension corresponds to a unique word.
+           -  Words are converted individually, and then combined to form a vector.
+         This technique ensures that each word is treated uniquely and efficiently in NLP tasks.
         """)
     st.markdown("""
+        -  Each word gets a unique vector representation.
+        -  The number of dimensions = total vocabulary size.
+        -  Words are vectorized separately, then combined into document vectors.
     """)
     st.markdown("""
     """, unsafe_allow_html=True)
     st.markdown("""
         - d₁ → v₁ → `[[1,0,0,0,0] , [0,1,0,0,0] , [0,0,1,0,0]]`
         - d₂ → v₂ → `[[1,0,0,0,0] , [0,1,0,0,0] , [0,0,0,1,0] , [0,0,1,0,0]]`
         - d₃ → v₃ → `[[0,0,0,0,1], [1,0,0,0,0]]`
+     This One-Hot Vectorization technique converts words into numerical vectors while preserving their uniqueness.
     """)
     st.markdown("""
+        ###  Key Takeaways:
+        -  Each word is represented as a 5-dimensional vector.
+        -  Every dimension corresponds to a unique word in the vocabulary.
+        -  This method is useful for transforming text into a numerical format for Machine Learning tasks.
     """)
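
The steps described in the markdown above (build a vocabulary, take its length as d, emit one d-dimensional vector per word) can be sketched in plain Python. This is a minimal illustration, not the code from the Space: the three-document corpus is hypothetical (the hunk does not show the actual documents) and was chosen only so that the output reproduces the v₁, v₂, v₃ vectors listed in the diff. Whitespace tokenization and first-appearance index order are also assumptions.

```python
def build_vocabulary(documents):
    """Map each unique token to an index, in order of first appearance."""
    vocabulary = {}
    for document in documents:
        for token in document.lower().split():  # assumption: whitespace tokenization
            if token not in vocabulary:
                vocabulary[token] = len(vocabulary)
    return vocabulary


def one_hot_encode(document, vocabulary):
    """Return one d-dimensional one-hot vector per token, where d = len(vocabulary)."""
    d = len(vocabulary)
    vectors = []
    for token in document.lower().split():
        vector = [0] * d
        vector[vocabulary[token]] = 1  # exactly one position is set per word
        vectors.append(vector)
    return vectors


# Hypothetical corpus: 5 unique words, so every vector has 5 dimensions.
corpus = ["nlp is fun", "nlp is really fun", "learn nlp"]
vocabulary = build_vocabulary(corpus)
encoded = [one_hot_encode(document, vocabulary) for document in corpus]

for i, vectors in enumerate(encoded, start=1):
    print(f"d{i} -> {vectors}")
```

Each document becomes a list of word vectors, so documents of different lengths yield matrices with different numbers of rows, and d grows with every new word added to the vocabulary.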