---
license: apache-2.0
datasets:
- ade-benchmark-corpus/ade_corpus_v2
language:
- en
base_model:
- dmis-lab/biobert-base-cased-v1.2
pipeline_tag: text-classification
tags:
- biomedical
- nlp
- adverse-drug-effects
- bert
- biobert
---

# BioBERT for Adverse Drug Effect (ADE) Classification

This model is a fine-tuned version of [`dmis-lab/biobert-base-cased-v1.2`](https://huggingface.co/dmis-lab/biobert-base-cased-v1.2) for binary sentence classification: does a sentence describe an **adverse drug effect (ADE)**?
It was fine-tuned on the [ADE Corpus V2](https://huggingface.co/datasets/ade-benchmark-corpus/ade_corpus_v2) dataset and compared against a classical TF-IDF + Logistic Regression baseline, as part of a broader project that benchmarks classical and transformer approaches on imbalanced biomedical text.

**Project Repo:** [GitHub](https://github.com/steven-cheun/nlp-ade-classification)

## Results (Test Set: N=3,528)

| Model | Weighted F1 | ADE Class F1 | Accuracy | Total Errors |
|---|---|---|---|---|
| TF-IDF + Logistic Regression | 0.90 | 0.84 | 90% | 349 |
| **BioBERT (this model)** | **0.96** | **0.93** | **96%** | **145** |

BioBERT reduced misclassifications by **58%** (349 → 145 errors) compared to the classical baseline.
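
The accuracy and error-reduction figures follow directly from the error counts in the table. A minimal sketch of that arithmetic, using only the numbers reported above (the function and variable names are illustrative, not taken from the project code):

```python
# Figures taken from the results table above.
TEST_SET_SIZE = 3528
errors = {"tfidf_logreg": 349, "biobert": 145}


def accuracy(n_errors: int, n_total: int) -> float:
    """Accuracy implied by an error count on a test set of n_total examples."""
    return 1 - n_errors / n_total


def relative_error_reduction(baseline_errors: int, model_errors: int) -> float:
    """Fraction of the baseline's errors that the model eliminates."""
    return (baseline_errors - model_errors) / baseline_errors


baseline_acc = accuracy(errors["tfidf_logreg"], TEST_SET_SIZE)
biobert_acc = accuracy(errors["biobert"], TEST_SET_SIZE)
reduction = relative_error_reduction(errors["tfidf_logreg"], errors["biobert"])

print(f"Baseline accuracy: {baseline_acc:.1%}")
print(f"BioBERT accuracy:  {biobert_acc:.1%}")
print(f"Error reduction:   {reduction:.1%}")
```

Rounded to whole percentages, these values match the 90%, 96%, and 58% quoted in this section.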

## Training Details

- **Base model:** `dmis-lab/biobert-base-cased-v1.2` (110M parameters)
- **Epochs:** 3 (best checkpoint selected by validation F1)
- **Learning rate:** 2e-5
- **Batch size:** 16
- **Max sequence length:** 128
- **Precision:** fp16
- **Data split:** stratified 70/15/15 train/val/test (seed=42)

| Epoch | Train Loss | Val F1 | Val Accuracy |
|---|---|---|---|
| 1 | 0.175 | 0.943 | 0.943 |
| 2 | 0.114 | 0.952 | 0.952 |
| 3 | 0.043 | 0.952 | 0.952 |
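
The stratified 70/15/15 split listed above keeps the ADE / non-ADE ratio the same in each subset, which matters for an imbalanced corpus. The project's actual splitting code is not shown here, so the following is only a standard-library sketch of the idea; `stratified_split` and the toy labels are hypothetical, and only the fractions and the seed come from this card.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=42):
    """Return (train, val, test) index lists with per-class proportions preserved."""
    rng = random.Random(seed)

    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        # Shuffle within each class, then cut at the 70% and 85% marks.
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]
    return train, val, test


# Toy imbalanced labels (hypothetical, not the real corpus): 30% ADE, 70% not ADE.
labels = [1] * 300 + [0] * 700
train, val, test = stratified_split(labels)

for name, split in (("train", train), ("val", val), ("test", test)):
    ade_share = sum(labels[i] for i in split) / len(split)
    print(f"{name}: {len(split)} examples, ADE share = {ade_share:.2f}")
```

In practice a library routine such as scikit-learn's `train_test_split` with its `stratify` argument is the more common choice; the hand-rolled version here only makes the mechanics explicit.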

## Usage

```python
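import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the fine-tuned classifier and its tokenizer from the Hub.
model = AutoModelForSequenceClassification.from_pretrained("scheun/biobert-ade-classifier")
tokenizer = AutoTokenizer.from_pretrained("scheun/biobert-ade-classifier")

# Truncate to the 128-token maximum used during fine-tuning.
text = "Patient developed severe nausea after taking the medication."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():  # gradients are not needed for inference
    outputs = model(**inputs)
prediction = outputs.logits.argmax(-1).item()
print(prediction)  # 0 = not ADE, 1 = ADE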
```
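
The usage snippet returns only the predicted class. When a confidence score is useful, for example to rank sentences for manual review, the two logits can be converted to probabilities with a softmax. The sketch below uses plain Python and made-up logit values, so the numbers and the `THRESHOLD` constant are assumptions, not model outputs.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities, subtracting the max for numerical stability."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one sentence, ordered as [not ADE, ADE].
example_logits = [-1.2, 2.3]
probs = softmax(example_logits)
ade_probability = probs[1]

# Assumed default threshold; a different value trades precision against recall
# on the ADE class and would need tuning on validation data.
THRESHOLD = 0.5
is_ade = ade_probability >= THRESHOLD

print(f"P(ADE) = {ade_probability:.3f}, flagged = {is_ade}")
```

With the tensors from the usage snippet, `torch.softmax(outputs.logits, dim=-1)` performs the same computation.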

## Limitations

- Trained on sentences from MEDLINE case reports. Performance may degrade on text from other domains.
- Binary classification only. The model does not extract which drug or which effect is mentioned.

## References

- Gurulingappa et al. (2012), *Development of a benchmark corpus to support the automatic extraction of drug-related adverse effects from medical case reports*
- Lee et al. (2020), *BioBERT: a pre-trained biomedical language representation model for biomedical text mining*