Commit 2cd8f03
1 Parent(s): 8b64d21
Update README.md
README.md
CHANGED
@@ -39,7 +39,7 @@ Keyphrase extraction is a technique in text analysis where you extract the impor
 
 ## 📓 Model Description
 This model is a fine-tuned KeyBART model on the Inspec dataset.
-KeyBART focuses on learning a better representation of keyphrases in a generative setting. It produces the keyphrases associated with the input. This is accomplished by predicting the original input based on a changed input. The input is changed by token masking, keyphrase masking and keyphrase replacement.
+KeyBART focuses on learning a better representation of keyphrases in a generative setting. It produces the keyphrases associated with the input. This is accomplished by predicting the original input based on a changed input. The input is changed by token masking, keyphrase masking and keyphrase replacement. This model can already be used without any fine-tuning, but can be fine-tuned if needed.
 You can find more information about the architecture in this paper: https://arxiv.org/abs/2112.08547.
 
 Kulkarni, Mayank, Debanjan Mahata, Ravneet Arora, and Rajarshi Bhowmik. "Learning Rich Representation of Keyphrases from Text." arXiv preprint arXiv:2112.08547 (2021).
@@ -59,7 +59,6 @@ from transformers import (
     AutoModelForSeq2SeqLM,
     AutoTokenizer,
 )
-import numpy as np
 
 
 class KeyphraseGenerationPipeline(Text2TextGenerationPipeline):
@@ -76,7 +75,8 @@ class KeyphraseGenerationPipeline(Text2TextGenerationPipeline):
         results = super().postprocess(
             model_outputs=model_outputs
         )
-        return [[keyphrase.strip() for keyphrase in result.get("generated_text").split(self.keyphrase_sep_token)] for result in results]
+        return [[keyphrase.strip() for keyphrase in result.get("generated_text").split(self.keyphrase_sep_token) if keyphrase != ""] for result in results]
+
 ```
 
 ```python
@@ -88,10 +88,11 @@ generator = KeyphraseGenerationPipeline(model=model_name)
 text = """
 Keyphrase extraction is a technique in text analysis where you extract the important keyphrases from a text.
 Since this is a time-consuming process, Artificial Intelligence is used to automate it.
-Currently, classical machine learning methods, that use statistics and linguistics,
-The fact that these methods have been widely used in the community
-
-
+Currently, classical machine learning methods, that use statistics and linguistics,
+are widely used for the extraction process. The fact that these methods have been widely used in the community
+has the advantage that there are many easy-to-use libraries. Now with the recent innovations in NLP,
+transformers can be used to improve keyphrase extraction. Transformers also focus on the semantics
+and context of a document, which is quite an improvement.
 """.replace(
     "\n", ""
 )
@@ -210,4 +211,4 @@ The model achieves the following results on the Inspec test set:
 For more information on the evaluation process, you can take a look at the keyphrase extraction evaluation notebook.
 
 ## 🚨 Issues
-Please feel free to
+Please feel free to start discussions in the Community Tab.
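
A note on the `postprocess` change: the added `if keyphrase != ""` filter runs before `.strip()`, so a fragment containing only whitespace (for example the piece between `; ;`) would still end up in the output as an empty string. The snippet below is a minimal standalone sketch of the same split-and-filter step that tests the stripped value instead. The function name `split_keyphrases`, the `";"` separator default, and the sample input are assumptions made for illustration; they are not taken from the commit.

```python
from typing import Dict, List


def split_keyphrases(results: List[Dict[str, str]], sep: str = ";") -> List[List[str]]:
    """Split each generated string on `sep`, trim whitespace, and drop empty pieces."""
    return [
        [
            piece.strip()
            for piece in result.get("generated_text", "").split(sep)
            if piece.strip()  # filter on the stripped value, not the raw fragment
        ]
        for result in results
    ]


# Hypothetical model output with a whitespace-only fragment and a trailing separator.
sample = [{"generated_text": "keyphrase extraction; text analysis; ;transformers;"}]
keyphrases = split_keyphrases(sample)
print(keyphrases)
# [['keyphrase extraction', 'text analysis', 'transformers']]
```

Filtering on `piece.strip()` costs a second strip call per fragment, which is negligible for the short outputs a keyphrase generator typically returns.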