Update README.md
README.md
CHANGED
|
@@ -22,7 +22,7 @@ most popular characters. In total about 6k lines of dialogs were collected. For
|
|
| 22 |
The first part of training is based on this notebook: [HW1_NLP_GEN_TFIDF_W2V_BERT_StarodubovKG.ipynb](https://huggingface.co/spaces/StKirill/chatbot/blob/main/HW1_NLP_GEN_TFIDF_W2V_BERT_StarodubovKG.ipynb)
|
| 23 |
The TF-IDF, W2V, and BERT algorithms are covered in it. The nltk library is used to clean and prepare the data for the first two algorithms: punctuation and stopwords are removed, and lemmatization reduces words to their base form.
|
| 24 |
The chart below shows that sentence length in the data does not exceed 60 words.
|
| 25 |
- [image: sentence-length chart])
|
| 26 |
The TF-IDF chatbot converts the question to a vector, finds the identical or most relevant vector in the database, and returns the answer stored for that vector to the user.
|
| 27 |
Experiments show that the TF-IDF chatbot gives good results out of the box. It works perfectly if we have all possible questions and answers.
|
| 28 |
|
|
|
|
| 22 |
The first part of training is based on this notebook: [HW1_NLP_GEN_TFIDF_W2V_BERT_StarodubovKG.ipynb](https://huggingface.co/spaces/StKirill/chatbot/blob/main/HW1_NLP_GEN_TFIDF_W2V_BERT_StarodubovKG.ipynb)
|
| 23 |
The TF-IDF, W2V, and BERT algorithms are covered in it. The nltk library is used to clean and prepare the data for the first two algorithms: punctuation and stopwords are removed, and lemmatization reduces words to their base form.
|
| 24 |
The chart below shows that sentence length in the data does not exceed 60 words.
|
| 25 |
+ [image: sentence-length chart]
|
| 26 |
The TF-IDF chatbot converts the question to a vector, finds the identical or most relevant vector in the database, and returns the answer stored for that vector to the user.
|
| 27 |
Experiments show that the TF-IDF chatbot gives good results out of the box. It works perfectly if we have all possible questions and answers.
|
| 28 |
|
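The cleaning and TF-IDF retrieval steps described in the README text can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the notebook's actual code: the notebook uses nltk for stopwords and lemmatization, while here `STOPWORDS` is a tiny hypothetical list and lemmatization is skipped. `QA_PAIRS`, `TfidfChatbot`, and the helper names are assumptions made for the example.

```python
import math
import string
from collections import Counter

# Hypothetical stopword list for illustration; the notebook relies on nltk instead.
STOPWORDS = {"a", "an", "the", "is", "are", "do", "to", "of", "you"}

def preprocess(text):
    """Lowercase, strip punctuation, and drop stopwords (no lemmatization here)."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in text.split() if tok not in STOPWORDS]

def fit_idf(token_lists):
    """Compute a smoothed inverse document frequency for every known term."""
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1.0
            for term, df in doc_freq.items()}

def tfidf_vector(tokens, idf):
    """Build an L2-normalized sparse TF-IDF vector; unknown terms are ignored."""
    counts = Counter(tok for tok in tokens if tok in idf)
    weights = {term: count * idf[term] for term, count in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {term: w / norm for term, w in weights.items()} if norm else {}

def cosine(a, b):
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())

class TfidfChatbot:
    def __init__(self, qa_pairs):
        self.answers = [answer for _, answer in qa_pairs]
        token_lists = [preprocess(question) for question, _ in qa_pairs]
        self.idf = fit_idf(token_lists)
        self.vectors = [tfidf_vector(tokens, self.idf) for tokens in token_lists]

    def reply(self, question):
        """Return the answer of the most similar stored question, or None."""
        query = tfidf_vector(preprocess(question), self.idf)
        scores = [cosine(query, vec) for vec in self.vectors]
        best = max(range(len(scores)), key=scores.__getitem__)
        return self.answers[best] if scores[best] > 0 else None

# Hypothetical question/answer pairs standing in for the dialog dataset.
QA_PAIRS = [
    ("How are you doing today?", "I'm fine, thanks."),
    ("Where do you live?", "In an apartment in the city."),
    ("What is your favorite food?", "Pizza, obviously."),
]

bot = TfidfChatbot(QA_PAIRS)
print(bot.reply("Where do you live?"))
```

A question that shares no terms with any stored question gets a similarity of zero, so `reply` returns `None` in that case. This matches the observation that the approach works best when the database already contains the questions users are likely to ask.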