G.Hemanth Sai
committed on
Commit
·
6b40170
1
Parent(s):
32d9382
readme changes
README.md
CHANGED
|
@@ -1,85 +1,10 @@
|
|
| 1 |
-
# Internship-IVIS-labs
|
| 2 |
-
|
| 3 |
-
- The *Intelligent Question Generator* app is an easy-to-use interface built in Streamlit which uses [KeyBERT](https://github.com/MaartenGr/KeyBERT), [Sense2vec](https://github.com/explosion/sense2vec), and [T5](https://huggingface.co/ramsrigouthamg/t5_paraphraser).
|
| 4 |
-
- It uses a minimal keyword extraction technique that leverages multiple NLP embeddings and relies on [Transformers](https://huggingface.co/transformers/) 🤗 to create keywords/keyphrases that are most similar to a document.
|
| 5 |
-
- [sense2vec](https://github.com/explosion/sense2vec) (Trask et al., 2015) is a nice twist on word2vec that lets you learn more interesting and detailed word vectors.
|
| 6 |
-
|
| 7 |
-
## Repository Breakdown
|
| 8 |
-
### src Directory
|
| 9 |
---
|
| 10 |
-
|
| 11 |
-
|
| 12 |
-
|
| 13 |
-
|
| 14 |
-
|
| 15 |
-
|
| 16 |
-
|
| 17 |
-
|
| 18 |
-
|
| 19 |
-
$ git clone https://github.com/HemanthSai7/Internship-IVIS-labs.git
|
| 20 |
-
```
|
| 21 |
-
```shell
|
| 22 |
-
$ cd Internship-IVIS-labs
|
| 23 |
-
```
|
| 24 |
-
```shell
|
| 25 |
-
pip install -r requirements.txt
|
| 26 |
-
```
|
| 27 |
-
- When running the app locally for the first time, you need to uncomment the lines in `src/Pipeline/QuestGen.py` that download the models to the models directory.
|
| 28 |
-
|
| 29 |
-
```shell
|
| 30 |
-
streamlit run app.py
|
| 31 |
-
```
|
| 32 |
-
- Once the app is running, you can access it at http://localhost:8501
|
| 33 |
-
```shell
|
| 34 |
-
You can now view your Streamlit app in your browser.
|
| 35 |
-
|
| 36 |
-
Local URL: http://localhost:8501
|
| 37 |
-
Network URL: http://192.168.0.103:8501
|
| 38 |
-
```
|
| 39 |
-
|
| 40 |
-
## Tech Stack Used
|
| 41 |
-

|
| 42 |
-

|
| 43 |
-

|
| 44 |
-

|
| 45 |
-

|
| 46 |
-

|
| 47 |
-

|
| 48 |
-

|
| 49 |
-

|
| 50 |
-

|
| 51 |
-
|
| 52 |
-
## Timeline
|
| 53 |
-
### Week 1-2:
|
| 54 |
-
#### Tasks
|
| 55 |
-
- [x] Understanding and brushing up on the concepts of NLP.
|
| 56 |
-
- [x] Extracting images and text from a PDF file and storing it in a text file.
|
| 57 |
-
- [x] Exploring various open source tools for generating questions from a given text.
|
| 58 |
-
- [x] Read papers related to the project (BERT, T5, RoBERTa, etc.).
|
| 59 |
-
- [x] Summarizing the text extracted from the PDF file using the pre-trained T5 base model.
|
| 60 |
-
|
| 61 |
-
### Week 3-4:
|
| 62 |
-
#### Tasks
|
| 63 |
-
- [x] Understanding the concept of QA systems.
|
| 64 |
-
- [x] Created a basic script for generating questions from the text.
|
| 65 |
-
- [x] Created a basic script for finding the context of the paragraph.
|
| 66 |
-
|
| 67 |
-
### Week 5-6:
|
| 68 |
-
#### Tasks
|
| 69 |
-
|
| 70 |
-
- [x] Understanding how Transformer models work for the NLP tasks of question answering and question generation.
|
| 71 |
-
- [x] Understanding how to use the Haystack library for QA systems.
|
| 72 |
-
- [x] Understanding how to use the Haystack library for Question generation.
|
| 73 |
-
- [x] Preprocessed the document for Haystack QA to get better results.
|
| 74 |
-
|
| 75 |
-
### Week 7-8:
|
| 76 |
-
#### Tasks
|
| 77 |
-
- [x] Understanding how to generate questions intelligently.
|
| 78 |
-
- [x] Explored WordNet to find synonyms.
|
| 79 |
-
- [x] Used BertWSD for disambiguating the sentence provided.
|
| 80 |
-
- [x] Used KeyBERT for finding the keywords in the document.
|
| 81 |
-
- [x] Used sense2vec for finding better words with high relatedness for the keywords generated.
|
| 82 |
-
|
| 83 |
-
### Week 9-10:
|
| 84 |
-
#### Tasks
|
| 85 |
-
- [x] Create a Streamlit app to demonstrate the project.
|
| 1 |
---
|
| 2 |
+
title: {{title}}
|
| 3 |
+
emoji: {{emoji}}
|
| 4 |
+
colorFrom: {{colorFrom}}
|
| 5 |
+
colorTo: {{colorTo}}
|
| 6 |
+
sdk: {{streamlit}}
|
| 7 |
+
sdk_version: {{1.13.0}}
|
| 8 |
+
app_file: app.py
|
| 9 |
+
pinned: false
|
| 10 |
+
---