metadata
license: mit
tags:
- roberta
- text-classification
- healthcare
- biomedical
- adverse-drug-reaction
- nlp
datasets:
- custom
language:
- en
model-index:
- name: RoBERTa ADR Severity Classifier
  results:
  - task:
      name: Text Classification
      type: text-classification
    metrics:
    - type: accuracy
      value: 0.891
    - type: f1
      value: 0.891
    - type: auc
      value: 0.956
RoBERTa ADR Severity Classifier
This is a fine-tuned RoBERTa model that detects Adverse Drug Reactions (ADRs) and classifies them as either severe (1) or not severe (0). It is trained on annotated ADR text data and is part of a broader NLP pipeline that extracts symptoms, diseases, and medications from biomedical reports.
Model Details
- Base Model: `roberta-base`
- Task: Binary Text Classification (`Severe` vs `Not Severe`)
- Training Data: 3,000+ annotated ADR descriptions
- Framework: Hugging Face Transformers + PyTorch
Intended Use
This model is intended for research and educational purposes in biomedical NLP. It can be used to:
- Flag potentially dangerous side effects in user-reported ADRs
- Prioritize ADR cases based on severity
- Serve as a backend for medical QA systems or healthcare apps
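As a rough illustration of the prioritization use case, the sketch below ranks reports by their predicted probability of being severe. The `prioritize_cases` helper, the 0.5 threshold, and the scored examples are all hypothetical; they are not part of the released model or its pipeline, and the scores are made up rather than real model outputs.

```python
from typing import List, Tuple


def prioritize_cases(
    cases: List[Tuple[str, float]], threshold: float = 0.5
) -> List[Tuple[str, float]]:
    """Keep cases whose P(severe) meets the threshold, most severe first.

    Hypothetical helper: `cases` holds (report text, P(severe)) pairs.
    """
    flagged = [case for case in cases if case[1] >= threshold]
    return sorted(flagged, key=lambda case: case[1], reverse=True)


# Made-up (report text, P(severe)) pairs for illustration only
scored_cases = [
    ("Mild nausea after first dose.", 0.12),
    ("Anaphylaxis requiring hospitalization.", 0.97),
    ("Persistent rash and facial swelling.", 0.71),
]

triage_queue = prioritize_cases(scored_cases, threshold=0.5)
for text, p_severe in triage_queue:
    print(f"{p_severe:.2f}  {text}")
```

In practice the threshold would likely need tuning against the recall and precision trade-off that matters for the application.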
Performance
Evaluated on a balanced test set of 1,623 samples:
| Metric | Class 0 (Not Severe) | Class 1 (Severe) |
|---|---|---|
| Precision | 0.904 | 0.880 |
| Recall | 0.865 | 0.915 |
| F1-Score | 0.884 | 0.897 |
| Accuracy | 0.891 | |
| AUC | 0.956 | |
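The per-class F1 scores in the table follow from the reported precision and recall as their harmonic mean. The short check below recomputes them. It also averages the two class scores, on the assumption (not stated in this card) that the single F1 value in the metadata is a macro average.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall values taken from the table above
f1_not_severe = f1_score(0.904, 0.865)
f1_severe = f1_score(0.880, 0.915)

# Assumption: the headline F1 is the unweighted mean of the class scores
macro_f1 = (f1_not_severe + f1_severe) / 2

print(f"{f1_not_severe:.3f} {f1_severe:.3f} {macro_f1:.3f}")
```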
Example Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForSequenceClassification.from_pretrained("calerio-uva/roberta-adr-model")
tokenizer = AutoTokenizer.from_pretrained("calerio-uva/roberta-adr-model")

text = "Severe migraine with vision loss and vomiting after taking ibuprofen."
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)

# Run inference without tracking gradients
with torch.no_grad():
    logits = model(**inputs).logits

# Convert logits to class probabilities (index 0 = Not Severe, index 1 = Severe)
probs = torch.softmax(logits, dim=1)
print(f"Not Severe: {probs[0][0]:.3f}, Severe: {probs[0][1]:.3f}")
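To make the final softmax step concrete, here is a minimal plain-Python sketch of turning a pair of logits into a label. The `softmax` and `label_from_logits` helpers, the `severe_threshold` parameter, and the example logits are hypothetical and only for illustration; the example above relies on `torch.softmax` instead.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_from_logits(logits: List[float], severe_threshold: float = 0.5) -> str:
    """Map [not_severe_logit, severe_logit] to a label string.

    Hypothetical helper: index 1 is treated as the Severe class,
    matching the class order used in the example above.
    """
    p_severe = softmax(logits)[1]
    return "Severe" if p_severe >= severe_threshold else "Not Severe"


# Made-up logits, not real model output
example_logits = [-1.2, 2.3]
example_probs = softmax(example_logits)
print(label_from_logits(example_logits))
```

Raising `severe_threshold` above 0.5 trades recall on the Severe class for precision, which may or may not be desirable depending on the use case.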