---
license: apache-2.0
datasets:
- uzw/PlainFact
language:
- en
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- biology
- medical
- classification
---

> This plain language summary classification model is part of the [PlainQAFact](https://github.com/zhiwenyou103/PlainQAFact) factuality evaluation framework.

## Classify the Input into Either Elaborative Explanation or Simplification

We fine-tuned the [microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext](https://huggingface.co/microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext) model on our curated sentence-level [PlainFact](https://huggingface.co/datasets/uzw/PlainFact) dataset.

## Model Overview

[PubMedBERT](https://huggingface.co/microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext) is a BERT model pre-trained from scratch on PubMed abstracts and full-text articles. It is optimized for biomedical text understanding and can be fine-tuned for various classification tasks such as:

- Medical document classification
- Disease/symptom categorization
- Clinical note classification
- Biomedical relation extraction

## How to use

Here is how to use this model in PyTorch:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load tokenizer and model
model_name = "uzw/plainqafact-pls-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
num_labels = 2  # binary classification: elaborative explanation vs. simplification
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=num_labels
)

# Example text
text = "Patient presents with acute myocardial infarction and elevated troponin levels."
inputs = tokenizer(
    text,
    padding=True,
    truncation=True,
    max_length=512,
    return_tensors="pt"
)

# Get predictions
model.eval()
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(predictions, dim=-1)

print(f"Predicted class: {predicted_class.item()}")
print(f"Confidence scores: {predictions}")
```

## Citation

If you use this classification model in your research, please cite it with the following BibTeX entry:

```
@misc{you2025plainqafactretrievalaugmentedfactualconsistency,
      title={PlainQAFact: Retrieval-augmented Factual Consistency Evaluation Metric for Biomedical Plain Language Summarization},
      author={Zhiwen You and Yue Guo},
      year={2025},
      eprint={2503.08890},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2503.08890},
}
```

> Code: https://github.com/zhiwenyou103/PlainQAFact
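The `predicted_class` in the usage example above is a bare integer index. The sketch below shows one way to map that index and its softmax score back to the two PlainFact classes. The `ID2LABEL` order here is an assumption for illustration only (confirm the real mapping against `model.config.id2label` of the downloaded checkpoint), and the softmax is written out in plain Python so each step of the conversion is explicit:

```python
import math

# Hypothetical id -> class mapping; this order is an ASSUMPTION.
# Check model.config.id2label on the actual checkpoint to confirm.
ID2LABEL = {0: "simplification", 1: "elaborative explanation"}

def logits_to_label(logits):
    """Softmax over raw logits, then return the top class name and its score."""
    shifted = [x - max(logits) for x in logits]  # shift for numerical stability
    exps = [math.exp(x) for x in shifted]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]

label, confidence = logits_to_label([-1.2, 2.3])
print(f"{label} ({confidence:.3f})")
```

The same mapping can be attached to the model itself by saving an `id2label`/`label2id` pair into its config, so downstream users see class names instead of integer ids.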