---
language: en
tags:
- t5
datasets:
- squad
license: mit
---

# Question Generation Model

## Fine-tuning Dataset

SQuAD 1.1

## Demo

https://huggingface.co/Sehong/t5-large-QuestionGeneration

## How to use
```python
import torch
from transformers import PreTrainedTokenizerFast, T5ForConditionalGeneration

# Load the fine-tuned checkpoint for both tokenizer and model.
tokenizer = PreTrainedTokenizerFast.from_pretrained('Sehong/t5-large-QuestionGeneration')
model = T5ForConditionalGeneration.from_pretrained('Sehong/t5-large-QuestionGeneration')

# Input format: "answer [SEP] context"; the text keeps WordPiece-style ' ##' sub-token markers.
text = "Saint Bern ##ade ##tte So ##ubi ##rous [SEP] Architectural ##ly , the school has a Catholic character . At ##op the Main Building ' s gold dome is a golden statue of the Virgin Mary . Immediately in front of the Main Building and facing it , is a copper statue of Christ with arms up ##rai ##sed with the legend \" V ##eni ##te Ad Me O ##m ##nes \" . Next to the Main Building is the Basilica of the Sacred Heart . Immediately behind the b ##asi ##lica is the G ##rot ##to , a Marian place of prayer and reflection . It is a replica of the g ##rot ##to at Lou ##rdes , France where the Virgin Mary reputed ##ly appeared to Saint Bern ##ade ##tte So ##ubi ##rous in 1858 . At the end of the main drive ( and in a direct line that connects through 3 statues and the Gold Dome ) , is a simple , modern stone statue of Mary ."

raw_input_ids = tokenizer.encode(text)
input_ids = [tokenizer.bos_token_id] + raw_input_ids + [tokenizer.eos_token_id]

summary_ids = model.generate(torch.tensor([input_ids]))

decode = tokenizer.decode(summary_ids.squeeze().tolist(), skip_special_tokens=True)

# Merge the ' ##' sub-token markers back into whole words and collapse double spaces.
decode = decode.replace(' # # ', '').replace('  ', ' ').replace(' ##', '')

print(decode)
```
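The post-processing step above can be wrapped in a small helper so it is easy to reuse and test. This is a minimal sketch; the function name `merge_wordpieces` is illustrative and not part of the model's API:

```python
def merge_wordpieces(text: str) -> str:
    """Collapse WordPiece-style ' ##' markers left in a decoded string."""
    text = text.replace(' # # ', '')  # drop stray fully-split markers
    text = text.replace('  ', ' ')    # collapse double spaces
    text = text.replace(' ##', '')    # re-join sub-tokens to the previous word
    return text

print(merge_wordpieces('Who reputed ##ly appeared at Lou ##rdes ?'))
# -> Who reputedly appeared at Lourdes ?
```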