---
license: mit
tags:
- roberta
- text-classification
- healthcare
- biomedical
- adverse-drug-reaction
- nlp
datasets:
- custom
language:
- en
model-index:
- name: RoBERTa ADR Severity Classifier
  results:
  - task:
      name: Text Classification
      type: text-classification
    metrics:
    - type: accuracy
      value: 0.891
    - type: f1
      value: 0.891
    - type: auc
      value: 0.956
---
# RoBERTa ADR Severity Classifier

This is a fine-tuned [RoBERTa](https://huggingface.co/roberta-base) model that detects **Adverse Drug Reactions (ADRs)** and classifies them as either **severe** (`1`) or **not severe** (`0`). It is trained on annotated ADR text data and is part of a broader NLP pipeline that extracts symptoms, diseases, and medications from biomedical reports.

---
## Model Details

- **Base Model:** `roberta-base`
- **Task:** Binary Text Classification (`Severe` vs `Not Severe`)
- **Training Data:** 3,000+ annotated ADR descriptions
- **Framework:** Hugging Face Transformers + PyTorch

---
## Intended Use

This model is intended for **research and educational purposes** in biomedical NLP. It can be used to:

- Flag potentially dangerous side effects in user-reported ADRs
- Prioritize ADR cases based on severity
- Serve as a backend for medical QA systems or healthcare apps
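
One way to approach the prioritization use case is to sort reports by the model's predicted probability for class `1` (severe). The sketch below is a minimal illustration and not part of the released pipeline: `rank_by_severity` and its `score_fn` parameter are hypothetical names, and the scorer is passed in as a callable so that anything returning a severe-class probability can be plugged in. The last lines use a crude keyword stub in place of the classifier, purely to show the call pattern without downloading the model.

```python
from typing import Callable, List, Tuple


def rank_by_severity(
    reports: List[str],
    score_fn: Callable[[str], float],
) -> List[Tuple[str, float]]:
    """Return (report, score) pairs sorted from most to least likely severe.

    `score_fn` is a hypothetical hook: it should map one report to the
    probability of class 1 (severe), e.g. the softmax output at index 1.
    """
    scored = [(text, float(score_fn(text))) for text in reports]
    # sorted() is stable, so reports with equal scores keep their input order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def keyword_stub(text: str) -> float:
    """Stand-in scorer for illustration only; this is NOT the classifier."""
    return 0.9 if "vision loss" in text.lower() else 0.1


queue = rank_by_severity(
    [
        "Mild nausea after the first dose.",
        "Vision loss and vomiting after ibuprofen.",
    ],
    keyword_stub,
)
for text, score in queue:
    print(f"{score:.2f}  {text}")
```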

---

## Performance

Evaluated on a balanced test set of 1,623 samples:

| Metric     | Class 0 (Not Severe) | Class 1 (Severe) |
|------------|----------------------|------------------|
| Precision  | 0.904                | 0.880            |
| Recall     | 0.865                | 0.915            |
| F1-Score   | 0.884                | 0.897            |
| Accuracy   | **0.891**            |                  |
| AUC        | **0.956**            |                  |
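
As a quick consistency check on the table, each per-class F1 score should equal the harmonic mean of that class's precision and recall, F1 = 2PR / (P + R). The snippet below recomputes it from the reported values and then takes the unweighted (macro) average of the two classes, which can be compared with the `f1` value in the card's metadata. The `f1_score` helper is defined here for illustration and is not taken from the evaluation code; because the inputs are rounded to three decimals, agreement is only expected to about three decimal places.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the performance table.
reported = {
    "Not Severe (0)": (0.904, 0.865),
    "Severe (1)": (0.880, 0.915),
}

per_class_f1 = {label: f1_score(p, r) for label, (p, r) in reported.items()}
for label, f1 in per_class_f1.items():
    print(f"{label}: F1 = {f1:.3f}")

# Unweighted (macro) average of the two per-class scores.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)
print(f"Macro F1 = {macro_f1:.3f}")
```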

---

## Example Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned classifier and its tokenizer from the Hugging Face Hub.
model = AutoModelForSequenceClassification.from_pretrained("calerio-uva/roberta-adr-model")
tokenizer = AutoTokenizer.from_pretrained("calerio-uva/roberta-adr-model")

text = "Severe migraine with vision loss and vomiting after taking ibuprofen."
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)

# Run inference without tracking gradients.
with torch.no_grad():
    logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=1)

# Index 0 is "not severe", index 1 is "severe".
print(f"Not Severe: {probs[0][0]:.3f}, Severe: {probs[0][1]:.3f}")