---
language:
- en
tags:
- semantic-role-labeling
- question-answer-generation
- pytorch
datasets:
- kleinay/qanom
---

# A Seq2Seq model for QANom parsing

This is a `t5-small` pretrained model, fine-tuned on the task of generating QANom QAs.

"QANom" stands for "QASRL for Nominalizations", an adaptation of [QASRL (Question-Answer driven Semantic Role Labeling)](https://qasrl.org) to the domain of nominal predicates. See the [QANom paper](https://aclanthology.org/2020.coling-main.274/) for details about the task. The official QANom dataset site is a [Google Drive folder](https://drive.google.com/drive/folders/15PHKVdPm65ysgdkV47z6J_73kETk7_of), but we have also wrapped it as a [Huggingface Dataset](https://huggingface.co/datasets/biu-nlp/qanom), which is easier to plug and play with (check out our [HF profile](https://huggingface.co/biu-nlp) for other related datasets, such as QASRL, QAMR, QADiscourse, and QA-Align).

## Demo

Visit [our demo](https://huggingface.co/spaces/kleinay/qanom-seq2seq-demo) to explore the model interactively!

## Usage

The model and tokenizer can be downloaded simply by running:
```python
import transformers

model = transformers.AutoModelForSeq2SeqLM.from_pretrained("kleinay/qanom-seq2seq-model-baseline")
tokenizer = transformers.AutoTokenizer.from_pretrained("kleinay/qanom-seq2seq-model-baseline")
```

However, the fine-tuning procedure involves input preprocessing (marking the predicate in the sentence, adding T5's "task prefix", and incorporating the predicate type and/or the verbal form of the nominalization) and output postprocessing (parsing the generated sequence into a list of QASRL-formatted QAs).
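As a rough illustration of what such preprocessing might produce: the sketch below combines a task prefix, the marked sentence, and the verb form into one input string. The prefix and separator here are made-up placeholders, not the model's actual input format, which is defined in `pipeline.py`.

```python
# Illustration only: the "parse:" prefix and "verb_form:" separator are
# assumptions for this sketch, not the model's actual input format
# (see pipeline.py in the repository for the real preprocessing).

def build_model_input(marked_sentence, verb_form):
    # Combine a T5-style task prefix, the sentence with its marked
    # predicate, and the verbal form of the nominalization.
    return f"parse: {marked_sentence} verb_form: {verb_form}"

build_model_input(
    "The student was interested in Luke 's <predicate> research about sea animals .",
    "research",
)
```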
To use the model for QANom parsing easily, we suggest downloading the `pipeline.py` file from this repository and then using the `QASRL_Pipeline` class:

```python
from pipeline import QASRL_Pipeline

pipe = QASRL_Pipeline("kleinay/qanom-seq2seq-model-baseline")
pipe("The student was interested in Luke 's <predicate> research about sea animals .", verb_form="research", predicate_type="nominal")
```
This will output:
```python
[{'generated_text': 'who _ _ researched something _ _ ?<extra_id_7> Luke',
  'QAs': [{'question': 'who researched something ?', 'answers': ['Luke']}]}]
```
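For illustration, here is a minimal sketch of how the raw `generated_text` could be parsed into a QA pair, assuming (as the output above suggests) that `<extra_id_7>` separates the question from the answer and that `_` tokens are empty slots of the QASRL question template. The authoritative postprocessing is the one in `pipeline.py`.

```python
# Minimal illustrative parser for the raw output format shown above.
# Assumptions: "<extra_id_7>" separates the question from the answer,
# and "_" tokens are empty question-template slots. The real
# postprocessing in pipeline.py handles more cases (e.g. multiple QAs).

def parse_raw_output(generated_text):
    question_part, _, answer_part = generated_text.partition("<extra_id_7>")
    # Drop the "_" placeholders to obtain the surface question
    question = " ".join(tok for tok in question_part.split() if tok != "_")
    return {"question": question, "answers": [answer_part.strip()]}

parse_raw_output("who _ _ researched something _ _ ?<extra_id_7> Luke")
# -> {'question': 'who researched something ?', 'answers': ['Luke']}
```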

Notice that you need to specify which word in the sentence is the predicate that the questions should be about. By default, you mark the predicate by preceding it with the `<predicate>` symbol, but you can also specify your own predicate marker:
```python
pipe("The student was interested in Luke 's <PRED> research about sea animals .", verb_form="research", predicate_type="nominal", predicate_marker="<PRED>")
```
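If your input is already tokenized, a small helper (hypothetical, not part of this repository) can insert the marker before the predicate token:

```python
# Hypothetical helper (not part of this repository): insert a predicate
# marker before the token at the given index of a pre-tokenized sentence.

def mark_predicate(tokens, predicate_index, marker="<predicate>"):
    marked = tokens[:predicate_index] + [marker] + tokens[predicate_index:]
    return " ".join(marked)

tokens = "The student was interested in Luke 's research about sea animals .".split()
sentence = mark_predicate(tokens, tokens.index("research"))
# sentence == "The student was interested in Luke 's <predicate> research about sea animals ."
```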
In addition, you can specify additional kwargs to control the model's decoding algorithm:
```python
pipe("The student was interested in Luke 's <predicate> research about sea animals .", verb_form="research", predicate_type="nominal", num_beams=3)
```