---
base_model:
- facebook/bart-large
language:
- en
license: apache-2.0
library_name: pytorch
pipeline_tag: question-generation
---

> This Question Generation model is a part of the [PlainQAFact](https://github.com/zhiwenyou103/PlainQAFact) factuality evaluation framework.

## Generating Questions Given Context and Answers

The original BART model is not pre-trained on question generation (QG) tasks. We fine-tuned `facebook/bart-large` on 55k human-created question-answer pairs with contexts, collected by [Demszky et al. (2018)](https://arxiv.org/abs/1809.02922). The dataset includes SQuAD and QA2D question-answer pairs along with their associated contexts.

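For reference, each input pairs a context with an answer, and the target is the corresponding question, mirroring the inference-time input in the usage example below. The sketch below of mapping a SQuAD-style record into that format is illustrative only; the field names and helper are assumptions, not the preprocessing actually used for fine-tuning.

```python
# Hypothetical sketch: map a SQuAD-style record to a
# (context, answer) -> question pair for QG fine-tuning.
# Field names follow the SQuAD format; the real preprocessing
# pipeline for this model is not documented here.
def to_qg_example(record):
    return {
        "context": record["context"],
        "answer": record["answers"]["text"][0],  # first gold answer span
        "question": record["question"],          # generation target
    }
```
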
## How to use

Here is how to use this model in PyTorch:

```python
from transformers import BartForConditionalGeneration, BartTokenizer
import torch

tokenizer = BartTokenizer.from_pretrained('uzw/bart-large-question-generation')
model = BartForConditionalGeneration.from_pretrained('uzw/bart-large-question-generation')

context = "The Thug cult resides at the Pankot Palace."
answer = "The Thug cult"

# Encode the context and answer together as a single paired input
inputs = tokenizer.encode_plus(
    context,
    answer,
    max_length=512,
    padding='max_length',
    truncation=True,
    return_tensors='pt'
)

with torch.no_grad():
    generated_ids = model.generate(
        input_ids=inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        max_length=64,           # maximum length of each generated question
        num_return_sequences=3,  # generate multiple candidate questions
        do_sample=True,          # enable sampling for diversity
        temperature=0.7          # control randomness of generation
    )

generated_questions = tokenizer.batch_decode(
    generated_ids,
    skip_special_tokens=True
)

for i, question in enumerate(generated_questions, 1):
    print(f"Generated Question {i}: {question}")
```

Adjust the `num_return_sequences` parameter to generate multiple questions.

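If more deterministic output is preferred, sampling can be replaced with beam search. The snippet below reuses `model` and `inputs` from the example above; the decoding settings are a sketch, not the configuration used by PlainQAFact.

```python
# Beam-search decoding variant (illustrative settings, reusing
# `model` and `inputs` from the example above).
# num_return_sequences must not exceed num_beams.
with torch.no_grad():
    generated_ids = model.generate(
        input_ids=inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        max_length=64,
        num_beams=4,
        num_return_sequences=3,
        early_stopping=True
    )
```
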
## Citation

If you use this QG model in your research, please cite it with the following BibTeX entry:

```
@misc{you2025plainqafactautomaticfactualityevaluation,
      title={PlainQAFact: Automatic Factuality Evaluation Metric for Biomedical Plain Language Summaries Generation},
      author={Zhiwen You and Yue Guo},
      year={2025},
      eprint={2503.08890},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2503.08890},
}
```

Code: https://github.com/zhiwenyou103/PlainQAFact