Harika22 committed on
Commit
39864a0
·
verified ·
1 Parent(s): e92ee8b

Update pages/6_Feature_Engineering.py

Files changed (1)
  1. pages/6_Feature_Engineering.py +18 -0
pages/6_Feature_Engineering.py CHANGED
@@ -153,5 +153,23 @@ if file_type == "One-Hot Vectorization":
  - This technique is called One-Hot Vectorization
  ''')
 
+ st.markdown('''An example of One-Hot Vectorization:
+ - Consider a corpus that contains 3 documents: d1, d2, d3
+ - d1 ➡️ Toy is good
+ - d2 ➡️ Toy is not good
+ - d3 ➡️ Bad toy
+ - d1 is converted into v1 (v1 is the numerical representation of d1)
+ - d2 is converted into v2 (v2 is the numerical representation of d2)
+ - d3 is converted into v3 (v3 is the numerical representation of d3)
+ - A vocabulary of the unique words (ignoring case) is created ➡️ {toy, is, good, not, bad}
+ - len(vocabulary) = 5, so every word vector has 5 dimensions
+ - Each word is represented as a 5-dimensional vector in which every dimension corresponds to one unique word
+ - toy ➡️ [1,0,0,0,0] , is ➡️ [0,1,0,0,0] , good ➡️ [0,0,1,0,0] , not ➡️ [0,0,0,1,0] , bad ➡️ [0,0,0,0,1]
+ - d1 → v1 → [[1,0,0,0,0] , [0,1,0,0,0] , [0,0,1,0,0]]
+ - d2 → v2 → [[1,0,0,0,0] , [0,1,0,0,0] , [0,0,0,1,0] , [0,0,1,0,0]]
+ - d3 → v3 → [[0,0,0,0,1] , [1,0,0,0,0]]
+ - Here every word is converted into a one-hot vector, and the word vectors are combined to represent the document; this technique is known as **One-Hot Vectorization**
+ ''')
+
 
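The worked example in the added lines can be reproduced with a minimal plain-Python sketch. This is not code from `pages/6_Feature_Engineering.py`: the helper names `build_vocabulary` and `one_hot_vectorize` are hypothetical, and lowercasing plus whitespace tokenization are assumptions inferred from the example treating "Toy" and "toy" as the same word.

```python
def build_vocabulary(documents):
    """Return the unique lowercased words, in order of first appearance."""
    vocabulary = []
    for doc in documents:
        for word in doc.lower().split():  # assumption: whitespace tokenization
            if word not in vocabulary:
                vocabulary.append(word)
    return vocabulary


def one_hot_vectorize(document, vocabulary):
    """Return one one-hot vector per word of the document."""
    index = {word: i for i, word in enumerate(vocabulary)}
    vectors = []
    for word in document.lower().split():
        vector = [0] * len(vocabulary)  # one dimension per vocabulary word
        vector[index[word]] = 1         # mark the position of this word
        vectors.append(vector)
    return vectors


corpus = ["Toy is good", "Toy is not good", "Bad toy"]  # d1, d2, d3
vocabulary = build_vocabulary(corpus)
vectors = [one_hot_vectorize(doc, vocabulary) for doc in corpus]  # v1, v2, v3

for doc, vec in zip(corpus, vectors):
    print(doc, "->", vec)
```

Under these assumptions the vocabulary order is toy, is, good, not, bad, so the vectors line up with the ones listed in the markdown text. Note that each document yields a different number of word vectors, which is one reason this representation is awkward to feed directly into models that expect fixed-size inputs.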