| import streamlit as st | |
| st.markdown(""" | |
| <style> | |
| /* Set a soft background color */ | |
| body { | |
| background-color: #eef2f7; | |
| } | |
| /* Style for main title */ | |
| h1 { | |
| color: black; | |
| font-family: 'Roboto', sans-serif; | |
| font-weight: 700; | |
| text-align: center; | |
| margin-bottom: 25px; | |
| } | |
| /* Style for headers */ | |
| h2 { | |
| color: black; | |
| font-family: 'Roboto', sans-serif; | |
| font-weight: 600; | |
| margin-top: 30px; | |
| } | |
| /* Style for subheaders */ | |
| h3 { | |
| color: red; | |
| font-family: 'Roboto', sans-serif; | |
| font-weight: 500; | |
| margin-top: 20px; | |
| } | |
| .custom-subheader { | |
| color: black; | |
| font-family: 'Roboto', sans-serif; | |
| font-weight: 600; | |
| margin-bottom: 15px; | |
| } | |
| /* Paragraph styling */ | |
| p { | |
| font-family: 'Georgia', serif; | |
| line-height: 1.8; | |
| color: black; | |
| margin-bottom: 20px; | |
| } | |
| /* List styling with diamond bullets */ | |
| .icon-bullet { | |
| list-style-type: none; | |
| padding-left: 20px; | |
| } | |
| .icon-bullet li { | |
| font-family: 'Georgia', serif; | |
| font-size: 1.1em; | |
| margin-bottom: 10px; | |
| color: black; | |
| } | |
| .icon-bullet li::before { | |
| content: "◆"; | |
| padding-right: 10px; | |
| color: black; | |
| } | |
| /* Sidebar styling */ | |
| .sidebar .sidebar-content { | |
| background-color: #ffffff; | |
| border-radius: 10px; | |
| padding: 15px; | |
| } | |
| .sidebar h2 { | |
| color: #495057; | |
| } | |
| .step-box, | |
| .box { | |
| font-size: 18px; | |
| background-color: #F0F8FF; | |
| padding: 15px; | |
| border-radius: 10px; | |
| box-shadow: 2px 2px 8px #D3D3D3; | |
| line-height: 1.6; | |
| } | |
| .title { | |
| font-size: 26px; | |
| font-weight: bold; | |
| color: #E63946; | |
| text-align: center; | |
| margin-bottom: 15px; | |
| } | |
| .formula { | |
| font-size: 20px; | |
| font-weight: bold; | |
| color: #2A9D8F; | |
| background-color: #F7F7F7; | |
| padding: 10px; | |
| border-radius: 5px; | |
| text-align: center; | |
| margin-top: 10px; | |
| } | |
| /* Custom button style */ | |
| .streamlit-button { | |
| background-color: #00FFFF; | |
| color: #000000; | |
| font-weight: bold; | |
| } | |
| </style> | |
| """, unsafe_allow_html=True) | |
| st.header("Vectorization🧭") | |
| st.markdown( | |
| """ | |
| <div class='info-box'> | |
| <p>Vectorization is the process of converting text into numerical vectors.</p> | |
| <p>This allows ML models to process text data effectively.</p> | |
| </div> | |
| """, | |
| unsafe_allow_html=True | |
| ) | |
| st.markdown(""" | |
| There are several advanced vectorization techniques: | |
| <ul class="icon-bullet"> | |
| <li>Word Embedding </li> | |
| <li>Word2Vec </li> | |
| <li>Fasttext</li> | |
| </ul> | |
| """, unsafe_allow_html=True) | |
| st.sidebar.title("Navigation 🧭") | |
| file_type = st.sidebar.radio( | |
| "Choose a vectorization technique:", | |
| ("Word2Vec", "Fasttext")) | |
| st.header("Word Embedding Technique") | |
| st.markdown(''' | |
| - Word embedding is an advanced vectorization approach that converts text into vectors while preserving semantic meaning. | |
| - Any technique that preserves semantic meaning while converting text into vectors is a word embedding technique. | |
| - Two word embedding techniques are covered here: | |
|   - Word2Vec | |
|   - Fasttext | |
| ''') | |
| if file_type == "Word2Vec": | |
|     st.title(":red[Word2Vec]") | |
|     st.markdown( | |
|         """ | |
| <div class='box'> | |
| <h3 style='color: #6A0572;'>📌 How Does Word2Vec Work?</h3> | |
| <ul> | |
| <li>After <strong>training</strong>, we obtain the final <span class='highlight'>Word2Vec model</span></li> | |
| <li>The model stores a <strong>dictionary</strong> of word-vector pairs:</li> | |
| </ul> | |
| <pre style="background-color:#F7F7F7; padding: 10px; border-radius: 5px;"> | |
| { w1: [v1], w2: [v2], w3: [v3] } | |
| </pre> | |
| </div> | |
|         """, | |
|         unsafe_allow_html=True, | |
|     ) | |
|     st.markdown( | |
|         """ | |
| <div class='box'> | |
| <h3 style='color: #6A0572;'>⚙️ Training vs. Test Time</h3> | |
| <ul> | |
| <li><strong>Training Time</strong>: <span class='highlight'>Corpus + Shallow Neural Network (CBOW or Skip-gram)</span> → Generates Model</li> | |
| <li><strong>Test Time</strong>: <span class='highlight'>Word</span> → Looked up in Dictionary → Returns <span class='highlight'>Vector Representation</span></li> | |
| </ul> | |
| </div> | |
|         """, | |
|         unsafe_allow_html=True, | |
|     ) | |
|     st.markdown( | |
|         """ | |
| <h3 style='color: #6A0572;'>🔍 How Does It Preserve Meaning?</h3> | |
| <ul> | |
| <li>It learns from the <strong>context</strong> of words in the <span class='highlight'>corpus</span></li> | |
| <li>When given a word, it looks the word up in the dictionary and retrieves its <strong>semantic vector</strong></li> | |
| <li>Unlike count-based models such as Bag of Words, the <span class='highlight'>dimensions are not words</span> but learned features that capture meaning</li> | |
| </ul> | |
|         """, | |
|         unsafe_allow_html=True, | |
|     ) | |
|     st.markdown( | |
|         """ | |
| <div class='box'> | |
| <h3 style='color: #6A0572;'>📚 Why Is the Corpus Important?</h3> | |
| <ul> | |
| <li>The <strong>Word2Vec algorithm</strong> depends entirely on the corpus it is trained on</li> | |
| <li>Better corpus → Better word representation</li> | |
| <li>It <strong>preserves semantic meaning</strong> using neighboring words (context)</li> | |
| </ul> | |
| </div> | |
|         """, | |
|         unsafe_allow_html=True, | |
|     ) | |
| st.markdown(''' | |
| - | |
| ''') | |