This plain language summary (PLS) classification model is part of the PlainQAFact factuality evaluation framework.

Classify the Input into Either Elaborative Explanation or Simplification

We fine-tuned the microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext model on our curated sentence-level PlainFact dataset.
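
Because the training data is sentence-level, a multi-sentence summary presumably needs to be split into sentences before classification. The sketch below shows one naive way to do that with a regular expression. The helper name `split_into_sentences` and the example summary are illustrative assumptions, not part of the PlainQAFact code, and the framework itself may use a different splitter.

```python
import re
from typing import List


def split_into_sentences(summary: str) -> List[str]:
    """Naively split text after '.', '!' or '?' when followed by whitespace.

    Hypothetical helper for illustration only. It will also split after
    abbreviations such as "e.g." when they are followed by a space.
    """
    parts = re.split(r"(?<=[.!?])\s+", summary.strip())
    return [p for p in parts if p]


# Made-up plain language summary used only to demonstrate the splitter
summary = (
    "Heart attacks happen when blood flow to the heart is blocked. "
    "Doctors often measure troponin, a protein released by damaged heart muscle."
)

sentences = split_into_sentences(summary)
for sentence in sentences:
    print(sentence)
```

Each resulting sentence can then be passed to the classifier on its own. For real biomedical text, a dedicated sentence segmenter is likely to handle abbreviations and decimal numbers more reliably than this regex.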

Model Overview

PubMedBERT is a BERT model pre-trained from scratch on PubMed abstracts and full-text articles. It's optimized for biomedical text understanding and can be fine-tuned for various classification tasks such as:

  • Medical document classification
  • Disease/symptom categorization
  • Clinical note classification
  • Biomedical relation extraction

How to use

Here is how to use this model in PyTorch:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load tokenizer and model
model_name = "uzw/plainqafact-pls-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)

num_labels = 2  # two classes: elaborative explanation vs. simplification
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=num_labels
)

# Example text
text = "Patient presents with acute myocardial infarction and elevated troponin levels."

inputs = tokenizer(
    text,
    padding=True,
    truncation=True,
    max_length=512,
    return_tensors="pt"
)

# Get predictions
model.eval()
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(predictions, dim=-1)

print(f"Predicted class: {predicted_class.item()}")
print(f"Confidence scores: {predictions}")
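
The snippet above prints a class index rather than a class name. As a rough illustration of how raw logits turn into a named label and a confidence score, here is a dependency-free sketch. The mapping `ASSUMED_ID2LABEL` is an assumption made for this example: the actual index-to-label order of this checkpoint is not stated here and should be read from `model.config.id2label`.

```python
import math
from typing import Dict, List, Tuple

# ASSUMPTION: hypothetical label order, for illustration only.
# Check model.config.id2label for the mapping the checkpoint actually uses.
ASSUMED_ID2LABEL: Dict[int, str] = {
    0: "elaborative explanation",
    1: "simplification",
}


def softmax(logits: List[float]) -> List[float]:
    """Convert raw logits to probabilities (max-shifted for numerical stability)."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits: List[float], id2label: Dict[int, str]) -> Tuple[str, float]:
    """Return the most probable label name and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Made-up logits for a single sentence
label, confidence = decode([0.2, 1.7], ASSUMED_ID2LABEL)
print(f"{label} ({confidence:.3f})")
```

With the real model, `outputs.logits[0].tolist()` and `model.config.id2label` can be passed in place of the made-up values. If the checkpoint does not define label names, `id2label` will contain generic entries such as `LABEL_0` and `LABEL_1`.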

Citation

If you find this classifier useful for your research, please cite our work with the following BibTeX entry:

@article{YOU2026105019,
    title = {PlainQAFact: Retrieval-augmented factual consistency evaluation metric for biomedical plain language summarization},
    journal = {Journal of Biomedical Informatics},
    volume = {178},
    pages = {105019},
    year = {2026},
    issn = {1532-0464},
    doi = {https://doi.org/10.1016/j.jbi.2026.105019},
    url = {https://www.sciencedirect.com/science/article/pii/S1532046426000432},
    author = {Zhiwen You and Yue Guo},
    keywords = {Plain language summarization, Factual consistency evaluation, Retrieval-augmented generation, Hallucination, Large language models}
}

Code: https://github.com/zhiwenyou103/PlainQAFact

Model size: 0.1B params (Safetensors, tensor type F32)