---
base_model:
  - facebook/bart-large
language:
  - en
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
tags:
  - question-generation
datasets:
  - uzw/PlainFact
---

# uzw/bart-large-question-generation

This question generation (QG) model is part of the factuality evaluation framework introduced in *PlainQAFact: Retrieval-augmented Factual Consistency Evaluation Metric for Biomedical Plain Language Summarization*. For more details, refer to the paper and the GitHub repository.

## Generating Questions Given Context and Answers

The original BART model is not pre-trained on question generation. We fine-tuned facebook/bart-large on 55k human-created question-answer pairs with contexts, collected by Demszky et al. (2018). The dataset combines SQuAD and QA2D question-answer pairs with their associated contexts.

## How to use

Here is how to use this model in PyTorch:

```python
from transformers import BartForConditionalGeneration, BartTokenizer
import torch

tokenizer = BartTokenizer.from_pretrained('uzw/bart-large-question-generation')
model = BartForConditionalGeneration.from_pretrained('uzw/bart-large-question-generation')

context = "The Thug cult resides at the Pankot Palace."
answer = "The Thug cult"

# Encode the (context, answer) pair as a two-segment input
inputs = tokenizer.encode_plus(
    context,
    answer,
    max_length=512,
    padding='max_length',
    truncation=True,
    return_tensors='pt'
)

with torch.no_grad():
    generated_ids = model.generate(
        input_ids=inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        max_length=64,           # Maximum length of generated question
        num_return_sequences=3,  # Generate multiple questions
        do_sample=True,          # Enable sampling for diversity
        temperature=0.7          # Control randomness of generation
    )

generated_questions = tokenizer.batch_decode(
    generated_ids,
    skip_special_tokens=True
)

for i, question in enumerate(generated_questions, 1):
    print(f"Generated Question {i}: {question}")
```

Adjust the `num_return_sequences` parameter to generate multiple candidate questions.
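Because sampling is enabled, multiple returned sequences can be near-duplicates of one another. A minimal post-processing sketch (the `dedupe_questions` helper below is an illustration, not part of the released model) that keeps only the first occurrence of each question after light normalization:

```python
def dedupe_questions(questions):
    """Drop duplicate questions, ignoring case, surrounding
    whitespace, and a trailing question mark."""
    seen = set()
    unique = []
    for q in questions:
        key = q.strip().lower().rstrip("?")
        if key not in seen:
            seen.add(key)
            unique.append(q.strip())
    return unique

# Example: the second question differs only in casing/punctuation
print(dedupe_questions([
    "Who resides at the Pankot Palace?",
    "who resides at the Pankot Palace",
    "Where does the Thug cult reside?",
]))
# → ['Who resides at the Pankot Palace?', 'Where does the Thug cult reside?']
```

Feed `generated_questions` from the snippet above into this helper before downstream use if duplicate questions are undesirable.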

## Citation

If you use this QG model in your research, please cite it with the following BibTeX entry:

```bibtex
@misc{you2025plainqafactautomaticfactualityevaluation,
      title={PlainQAFact: Automatic Factuality Evaluation Metric for Biomedical Plain Language Summaries Generation},
      author={Zhiwen You and Yue Guo},
      year={2025},
      eprint={2503.08890},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2503.08890},
}
```