Update pages/Introduction.py

pages/Introduction.py  +14 −11  CHANGED
@@ -14,7 +14,7 @@ st.markdown("<p>NLP powers many applications that use language, such as text tra
 st.subheader("NLP Techniques")
 st.markdown("<p>NLP encompasses a wide array of techniques aimed at enabling computers to process and understand human language. These tasks can be categorized into several broad areas, each addressing different aspects of language processing. Here are some of the key NLP techniques:</p>", unsafe_allow_html=True)
 
-st.markdown('<p style="color
+st.markdown('<p style="color:;"><b>1. Text Processing and Preprocessing In NLP</b></p>', unsafe_allow_html=True)
 st.write("Before performing any analysis or modeling, raw text data must be cleaned and prepared.")
 st.markdown('<p style="color:lightyellow;"><b>a. Tokenization</b></p>', unsafe_allow_html=True)
 st.write("Splits text into smaller units like words or sentences.")
@@ -26,34 +26,34 @@ st.write("Example: _'I love NLP'_ → [‘I’, ‘love’, ‘NLP’]")
 st.write("**(ii) Sentence Tokenization:** Breaking text into sentences.")
 st.write("Example: _'I love NLP. It’s fascinating!'_ → [‘I love NLP.’, ‘It’s fascinating!’]")
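The two tokenization styles above can be sketched in plain Python. This is a rough regex heuristic, not what the page's app uses; a real pipeline would reach for NLTK's `word_tokenize`/`sent_tokenize` or spaCy:

```python
import re

def word_tokenize(text: str) -> list[str]:
    # Grab runs of word characters (incl. apostrophes) or standalone punctuation.
    return re.findall(r"[\w']+|[.,!?;]", text)

def sent_tokenize(text: str) -> list[str]:
    # Split after ., ! or ? followed by whitespace -- a rough heuristic.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

print(word_tokenize("I love NLP"))                    # ['I', 'love', 'NLP']
print(sent_tokenize("I love NLP. It's fascinating!"))  # ['I love NLP.', "It's fascinating!"]
```

NLTK's tokenizers handle abbreviations, quotes, and contractions that this heuristic misses.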
-st.markdown('<p style="color
+st.markdown('<p style="color:;"><b>b. Stopword Removal</b></p>', unsafe_allow_html=True)
 st.write("Removes common words like “the,” “and,” “is” that do not contribute much to analysis.")
 
 
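A minimal sketch of stopword filtering, assuming a hand-picked stopword list; NLTK ships a fuller one via `nltk.corpus.stopwords`:

```python
# Tiny illustrative stopword list -- real projects use a library-provided list.
STOPWORDS = {"the", "and", "is", "a", "an", "of", "to", "in"}

def remove_stopwords(tokens):
    return [t for t in tokens if t.lower() not in STOPWORDS]

print(remove_stopwords(["NLP", "is", "the", "future"]))  # ['NLP', 'future']
```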
-st.markdown('<p style="color
+st.markdown('<p style="color:;"><b>c. Stemming and Lemmatization</b></p>', unsafe_allow_html=True)
 st.write("Stemming: Reduces words to their base or root form by chopping off suffixes (may not produce valid words).")
 st.write("Example: _“running”_ → “run”")
 
 st.write("Lemmatization: Converts words to their base form using vocabulary and grammar.")
 st.write("Example: _“better”_ → “good”")
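The distinction can be illustrated with a toy suffix-stripper and a tiny lemma lookup; these are illustrative stand-ins for NLTK's `PorterStemmer` and `WordNetLemmatizer`:

```python
def naive_stem(word: str) -> str:
    # Crude suffix chopping in the spirit of the Porter stemmer.
    # "ning" is tried before "ing" so "running" -> "run", not "runn".
    for suffix in ("ning", "ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

# Tiny lemma lookup; a real lemmatizer consults a full vocabulary plus POS info.
LEMMAS = {"better": "good", "ran": "run", "mice": "mouse"}

def lemmatize(word: str) -> str:
    return LEMMAS.get(word, word)

print(naive_stem("running"))  # run
print(lemmatize("better"))    # good
```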
 
-st.markdown('<p style="color
+st.markdown('<p style="color:;"><b>d. Part-of-Speech (POS) Tagging</b></p>', unsafe_allow_html=True)
 st.write("Labels words with their grammatical roles (noun, verb, adjective, etc.).")
 st.write("Example: _“The cat sleeps”_ → [“The/DET”, “cat/NOUN”, “sleeps/VERB”]")
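A dictionary-lookup tagger is enough to show the output shape; real taggers (e.g. `nltk.pos_tag` or spaCy) are statistical models, not lookups:

```python
# Toy lexicon mapping lowercased words to tags; unknown words get 'X'.
LEXICON = {"the": "DET", "cat": "NOUN", "sleeps": "VERB"}

def pos_tag(tokens):
    return [f"{t}/{LEXICON.get(t.lower(), 'X')}" for t in tokens]

print(pos_tag(["The", "cat", "sleeps"]))  # ['The/DET', 'cat/NOUN', 'sleeps/VERB']
```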
 
-st.markdown('<p style="color
+st.markdown('<p style="color:;"><b>e. Named Entity Recognition (NER)</b></p>', unsafe_allow_html=True)
 st.write("Identifies and classifies entities in text (e.g., names, dates, locations).")
 st.write("Example: _“Barack Obama was born in Hawaii.”_ → [Barack Obama: PERSON, Hawaii: LOCATION]")
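The example output can be reproduced with a toy gazetteer lookup; production NER relies on trained models such as spaCy's pipelines rather than fixed lists:

```python
# Toy gazetteer: known entity strings and their labels.
ENTITIES = {"Barack Obama": "PERSON", "Hawaii": "LOCATION"}

def tag_entities(text):
    # Return (entity, label) pairs for every known entity found in the text.
    return [(name, label) for name, label in ENTITIES.items() if name in text]

print(tag_entities("Barack Obama was born in Hawaii."))
# [('Barack Obama', 'PERSON'), ('Hawaii', 'LOCATION')]
```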
 
 
-st.markdown('<p style="color
+st.markdown('<p style="color:;"><b>f. Text Normalization</b></p>', unsafe_allow_html=True)
 st.write("Converts text to a standard format (lowercasing, removing punctuation, etc.).")
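A typical normalization pass, sketched with the standard library:

```python
import re
import string

def normalize(text: str) -> str:
    text = text.lower()                                                # lowercase
    text = text.translate(str.maketrans("", "", string.punctuation))   # strip punctuation
    return re.sub(r"\s+", " ", text).strip()                           # collapse whitespace

print(normalize("  I LOVE NLP!!  "))  # i love nlp
```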
 
 
-st.markdown('<p style="color
+st.markdown('<p style="color:;"><b>2. Feature Extraction Techniques</b></p>', unsafe_allow_html=True)
 st.write("Text needs to be transformed into numerical representations for machine learning models.")
 
-st.markdown('<p style="color
+st.markdown('<p style="color:;"><b>a. Bag of Words (BoW)</b></p>', unsafe_allow_html=True)
 st.write("Represents text as a vector of word frequencies or occurrences, ignoring grammar and order.")
 st.write("Example:")
 st.write("Text: “I love NLP” and “NLP is great”")
@@ -61,7 +61,7 @@ st.write("Vocabulary: [“I”, “love”, “NLP”, “is”, “great”]")
 st.write("Vector for “I love NLP”: [1, 1, 1, 0, 0]")
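The vocabulary and vector above fall out of a few lines of counting; this sketch builds the vocabulary in first-seen order, as the example assumes (scikit-learn's `CountVectorizer` sorts it alphabetically instead):

```python
def bag_of_words(texts):
    # Build the vocabulary in first-seen order, then count occurrences per text.
    vocab = []
    for text in texts:
        for word in text.split():
            if word not in vocab:
                vocab.append(word)
    vectors = [[t.split().count(w) for w in vocab] for t in texts]
    return vocab, vectors

vocab, vecs = bag_of_words(["I love NLP", "NLP is great"])
print(vocab)    # ['I', 'love', 'NLP', 'is', 'great']
print(vecs[0])  # [1, 1, 1, 0, 0]
```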
 
 
-st.markdown('<p style="color
+st.markdown('<p style="color:;"><b>b. Term Frequency-Inverse Document Frequency (TF-IDF)</b></p>', unsafe_allow_html=True)
 st.write("The **TF-IDF Vectorizer** is a popular technique in Natural Language Processing (NLP) used to convert text into numerical values that can be used by machine learning models. It stands for Term Frequency-Inverse Document Frequency and helps highlight the importance of words in a document relative to a collection of documents (called a corpus).")
 
 st.write('**Term Frequency (TF)** \n - Measures how often a word appears in a single document. \n - Formula: \n _TF_ = Number of times the word appears in the document / Total number of words in the document')
@@ -89,10 +89,9 @@ st.write("""
 """)
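The TF and IDF formulas above combine into a small function. This uses the plain `log(N / df)` form of IDF; scikit-learn's `TfidfVectorizer` applies smoothing and normalization, so its numbers differ:

```python
import math

def tf_idf(docs):
    # TF = count / doc length; IDF = log(N / number of docs containing the term).
    vocab = sorted({w for d in docs for w in d.split()})
    n = len(docs)
    idf = {w: math.log(n / sum(w in d.split() for d in docs)) for w in vocab}
    weights = []
    for d in docs:
        words = d.split()
        weights.append({w: (words.count(w) / len(words)) * idf[w] for w in vocab})
    return weights

w = tf_idf(["I love NLP", "NLP is great"])
# "NLP" occurs in every document, so IDF = log(2/2) = 0 and its weight vanishes,
# while a word unique to one document (e.g. "love") keeps a positive weight.
print(w[0]["NLP"], w[0]["love"] > 0)
```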
 
 
-st.markdown('<p style="color
+st.markdown('<p style="color:;"><b>c. Word Embeddings</b></p>', unsafe_allow_html=True)
 st.write("Word embeddings are a type of representation for text where words are converted into dense numerical vectors. These vectors capture the semantic meaning of words and their relationships with other words in a way that computers can understand.")
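The idea that vectors capture word relationships can be demonstrated with a tiny count-based embedding; real embeddings (word2vec, GloVe, fastText) are trained, not counted, but the similarity intuition is the same:

```python
import math
from collections import defaultdict

def cooccurrence_vectors(corpus, window=1):
    # Each word's vector = counts of its neighbors within the window.
    vocab = sorted({w for sent in corpus for w in sent})
    index = {w: i for i, w in enumerate(vocab)}
    vecs = defaultdict(lambda: [0.0] * len(vocab))
    for sent in corpus:
        for i, w in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if i != j:
                    vecs[w][index[sent[j]]] += 1.0
    return dict(vecs)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

corpus = [["i", "love", "nlp"], ["i", "love", "pizza"]]
v = cooccurrence_vectors(corpus)
# "nlp" and "pizza" appear in the same context ("love"), so their vectors match.
print(cosine(v["nlp"], v["pizza"]))  # 1.0
```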
 
-import streamlit as st
 
 st.write("""
 **Word Embedding Techniques**
@@ -132,4 +131,8 @@ The future of Natural Language Processing (NLP) is exciting, with advancements t
 **5. Multimodal Learning**
 - Beyond Text: Integrating text with images, audio, and video for richer applications like understanding memes, videos, or interactive media.
 
+The future of NLP is about creating systems that communicate more naturally, inclusively, and intelligently, enabling transformative applications in every aspect of life.
+
 """)
|
| 137 |
+
|
| 138 |
+
|