# 🧠 Framing BERT Model (Multi-Label Classification)
A fine-tuned `bert-base-uncased` model for **multi-label classification** of *news framing elements*, based on a custom annotated dataset of news texts. The model detects four common elements of framing in political and social discourse.
## 📚 Overview
This project classifies **framing elements** in news text, based on the framing theory of **Robert Entman (1993)**. We fine-tuned a BERT-based sequence classification model to predict the presence of the following four core framing elements:
- **`define_problem`** – Whether the text defines a social or political problem.
- **`diagnose_cause`** – Whether the text attributes causes or sources for the issue.
- **`moral_judgment`** – Whether the text expresses a normative or value-laden evaluation.
- **`suggest_remedy`** – Whether the text suggests solutions, actions, or remedies.
---
## Model
- **Base Model:** `bert-base-uncased`
- **Architecture:** `BertForSequenceClassification`
- **Task Type:** Multi-label classification (binary prediction per label)
- **Tokenizer:** `BertTokenizer` from HuggingFace
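"Multi-label" here means each of the four labels gets an independent sigmoid-plus-threshold decision, rather than one softmax over competing classes. A minimal pure-Python sketch of that decision rule (the 0.5 threshold and the example logits are illustrative, not taken from the model):

```python
import math

LABELS = ["define_problem", "diagnose_cause", "moral_judgment", "suggest_remedy"]

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def logits_to_labels(logits, threshold: float = 0.5):
    """Apply an independent sigmoid + threshold decision per label."""
    return {label: sigmoid(z) > threshold for label, z in zip(LABELS, logits)}

print(logits_to_labels([2.1, -0.3, 0.8, -1.5]))
# → {'define_problem': True, 'diagnose_cause': False,
#    'moral_judgment': True, 'suggest_remedy': False}
```

Because the decisions are independent, any subset of the four labels can be active at once, including all or none.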
---
## Training Details
- **Dataset:** Custom annotated dataset labeled with Entman's framing elements.
- **Training Strategy:** Fine-tuning with Hugging Face Transformers; hyperparameters tuned with Optuna.
- **Validation Split:** 80/20 train-validation split
- **Evaluation Metrics:**
  - Accuracy
  - F1 Macro
  - Precision Macro
  - Recall Macro
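The macro variants compute each metric per label and then average across the four labels, so infrequent labels weigh as much as frequent ones. A minimal sketch of that averaging (illustrative, not the project's exact evaluation code):

```python
def precision_recall_f1(y_true, y_pred):
    """Binary precision/recall/F1 for one label over all samples."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def macro_scores(y_true_rows, y_pred_rows):
    """Score each label (column) separately, then average across labels."""
    per_label = [precision_recall_f1(t, p)
                 for t, p in zip(zip(*y_true_rows), zip(*y_pred_rows))]
    n = len(per_label)
    return tuple(sum(scores[i] for scores in per_label) / n for i in range(3))

# Two samples, four labels each (rows = samples, columns = labels).
y_true = [[1, 0, 1, 0], [1, 1, 0, 0]]
y_pred = [[1, 0, 0, 0], [1, 1, 0, 1]]
print(macro_scores(y_true, y_pred))  # → (0.5, 0.5, 0.5)
```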
### Best Hyperparameters (Optuna)
- **Learning Rate:** `4.24e-5`
- **Weight Decay:** `0.222`
- **Epochs:** `3`
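The best trial's values would map onto a Hugging Face `TrainingArguments` roughly as below (a sketch: `output_dir` and the batch size are placeholders, since neither is reported above):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="framing-bert",          # placeholder path
    learning_rate=4.24e-5,              # best Optuna trial
    weight_decay=0.222,                 # best Optuna trial
    num_train_epochs=3,                 # best Optuna trial
    per_device_train_batch_size=16,     # assumption: not reported
)
```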
---
## Training Performance
| Epoch | Train Loss | Val Loss | Accuracy | F1 Macro | Precision Macro | Recall Macro |
|-------|------------|----------|----------|----------|------------------|---------------|
| 1 | 0.6594 | 0.6530 | 0.2200 | 0.6758 | 0.5775 | 0.8178 |
| 2 | 0.6027 | 0.6355 | 0.2317 | 0.6329 | 0.6204 | 0.6490 |
| 3 | 0.5577 | 0.6404 | 0.2283 | 0.6386 | 0.6280 | 0.6566 |
- **Best F1 Macro:** `0.6386` (Trial 2)
---
## Final Prediction Example
**Predicted Framing Elements:**
```json
{
  "define_problem": true,
  "diagnose_cause": true,
  "moral_judgment": true,
  "suggest_remedy": true
}
```
## 🚀 Usage
```python
from transformers import BertTokenizerFast, BertForSequenceClassification
import torch
model = BertForSequenceClassification.from_pretrained("nurdyansa/framing-bert-model")
tokenizer = BertTokenizerFast.from_pretrained("nurdyansa/framing-bert-model")
model.eval()  # disable dropout for inference

text = "The government is failing to address the root causes of inflation."
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)

with torch.no_grad():
    logits = model(**inputs).logits

# Independent sigmoid + 0.5 threshold per label (multi-label setup)
probs = torch.sigmoid(logits).squeeze()
predictions = (probs > 0.5).int()

# Map predictions to labels
labels = ["define_problem", "diagnose_cause", "moral_judgment", "suggest_remedy"]
output = {label: bool(pred) for label, pred in zip(labels, predictions.tolist())}
print(output)
```
---
license: mit
language:
- en
metrics:
- f1
- precision
- recall
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
library_name: transformers
tags:
- framing
- multi-label
- bert
- optuna
- classification
- social-science
---