Spaces: Rajesh6/NLP
Rajesh6 committed on Nov 23, 2024

Commit cf7d4d0 (verified) · 1 parent: ed78dd2

Update pages/Introduction.py

Files changed (1)

pages/Introduction.py CHANGED

@@ -63,6 +63,6 @@ st.write("Vector for “I love NLP”: [1, 1, 1, 0, 0]")
 st.markdown('<p style="color:lightblue;"><b>b. Term Frequency-Inverse Document Frequency (TF-IDF)</b></p>', unsafe_allow_html=True)
 st.write("The **TF-IDF Vectorizer** is a popular technique in Natural Language Processing (NLP) used to convert text into numerical values that can be used by machine learning models. It stands for Term Frequency-Inverse Document Frequency and helps highlight the importance of words in a document relative to a collection of documents (called a corpus).")
-st.write("**Term Frequency (TF)** \n - Measures how often a word appears in a single document. \n - Formula: \n _ TF _ = Number of times the word appears in the document / Total number of words in the document" )
-st.write("**Inverse Document Frequency (IDF)** \n Measures how unique or rare a word is across all documents in the corpus. \n - Formula: \n  _ IDF _ = log(Total no.of documents / No of Documnets containing the word) \n Words that appear in many documents (like "the" or "and") will have a low IDF value, while unique words (like "NLP") will have a higher IDF.")
-st.write("**TF - IDF Score: \n - Combines TF and IDF to calculate the importance of a word in a document. \n - Formula: \n TF - IDF = TF x IDF \n Words that are frequent in a document but rare in the overall corpus get a higher score.")

+st.write('**Term Frequency (TF)** \n - Measures how often a word appears in a single document. \n - Formula: \n _ TF _ = Number of times the word appears in the document / Total number of words in the document' )
+st.write('**Inverse Document Frequency (IDF)** \n - Measures how unique or rare a word is across all documents in the corpus. \n - Formula: \n  _ IDF _ = log(Total number of documents / Number of documents containing the word) \n Words that appear in many documents (like "the" or "and") will have a low IDF value, while unique words (like "NLP") will have a higher IDF.')
+st.write('**TF-IDF Score** \n - Combines TF and IDF to calculate the importance of a word in a document. \n - Formula: \n TF-IDF = TF x IDF \n Words that are frequent in a document but rare in the overall corpus get a higher score.')
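
The three formulas in the page text can be checked with a small, standalone sketch. It is not part of this commit or of pages/Introduction.py. The helper names (`tf`, `idf`, `tf_idf`), the whitespace tokenization, the toy corpus, and the choice to score unseen terms as 0 are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf(term, doc_tokens):
    """Term frequency: occurrences of `term` / total tokens in the document."""
    if not doc_tokens:
        return 0.0
    return Counter(doc_tokens)[term] / len(doc_tokens)


def idf(term, corpus_tokens):
    """Inverse document frequency: log(total documents / documents containing `term`)."""
    containing = sum(1 for doc in corpus_tokens if term in doc)
    if containing == 0:
        # Assumption: a term absent from the corpus scores 0 instead of dividing by zero.
        return 0.0
    return math.log(len(corpus_tokens) / containing)


def tf_idf(term, doc_tokens, corpus_tokens):
    """TF-IDF score: TF x IDF, exactly as written in the page text."""
    return tf(term, doc_tokens) * idf(term, corpus_tokens)


# Hypothetical toy corpus, tokenized by splitting on whitespace.
corpus = ["i love nlp", "i love pizza", "i like the weather"]
tokenized = [doc.split() for doc in corpus]

# Score each word of the first document against the whole corpus.
for term in ("i", "love", "nlp"):
    print(term, round(tf_idf(term, tokenized[0], tokenized), 4))
# i 0.0
# love 0.1352
# nlp 0.3662
```

"i" appears in every document, so its IDF is log(1) = 0 and its score vanishes, while "nlp" appears in only one document and scores highest. Library implementations such as scikit-learn's `TfidfVectorizer` use a smoothed IDF and normalize each vector by default, so their numbers will differ from this plain version.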