---
license: apache-2.0
datasets:
- uzw/PlainFact
language:
- en
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- biology
- medical
- classification
---
> This plain language summary classification model is a part of the [PlainQAFact](https://github.com/zhiwenyou103/PlainQAFact) factuality evaluation framework.
## Classify the Input into Either Elaborative Explanation or Simplification
We fine-tuned the [microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext](https://huggingface.co/microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext) model on our curated sentence-level [PlainFact](https://huggingface.co/datasets/uzw/PlainFact) dataset.
## Model Overview
[PubMedBERT](https://huggingface.co/microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext) is a BERT model pre-trained from scratch on PubMed abstracts and full-text articles. It's optimized for biomedical text understanding and can be fine-tuned for various classification tasks such as:
- Medical document classification
- Disease/symptom categorization
- Clinical note classification
- Biomedical relation extraction
## How to use
Here is how to use this model in PyTorch:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load tokenizer and model
model_name = "uzw/plainqafact-pls-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)

num_labels = 2  # binary classification: elaborative explanation vs. simplification
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=num_labels,
)

# Example text
text = "Patient presents with acute myocardial infarction and elevated troponin levels."
inputs = tokenizer(
    text,
    padding=True,
    truncation=True,
    max_length=512,
    return_tensors="pt",
)

# Get predictions
model.eval()
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(predictions, dim=-1)

print(f"Predicted class: {predicted_class.item()}")
print(f"Confidence scores: {predictions}")
```
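Alternatively, the tokenize-predict-softmax steps above can be wrapped in a single call with the `transformers` pipeline API. A minimal sketch; note that the returned label names (e.g. `LABEL_0`/`LABEL_1`) come from the model's saved config, and their mapping to "elaborative explanation" vs. "simplification" is not documented here, so check `model.config.id2label` or the PlainQAFact repository before relying on them:

```python
from transformers import pipeline

# Convenience wrapper: handles tokenization, inference, and softmax internally
classifier = pipeline(
    "text-classification",
    model="uzw/plainqafact-pls-classifier",
)

text = "Patient presents with acute myocardial infarction and elevated troponin levels."
result = classifier(text)
print(result)  # a list with the top label and its score for the input text
```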
## Citation
If you use this classification model in your research, please cite it with the following BibTeX entry:
```bibtex
@misc{you2025plainqafactretrievalaugmentedfactualconsistency,
title={PlainQAFact: Retrieval-augmented Factual Consistency Evaluation Metric for Biomedical Plain Language Summarization},
author={Zhiwen You and Yue Guo},
year={2025},
eprint={2503.08890},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2503.08890},
}
```
> Code: https://github.com/zhiwenyou103/PlainQAFact