
# 🧠 Framing BERT Model (Multi-Label Classification)

A fine-tuned `bert-base-uncased` model for **multi-label classification** of *news framing elements*, based on a custom annotated dataset of news texts. The model detects four common elements of framing in political and social discourse.


## Overview

This project focuses on classifying **framing elements** in news text based on the framing theory of **Robert Entman (1993)**. We fine-tuned a BERT-based sequence classification model to predict the presence of the following four core framing elements:

- **`define_problem`** – Whether the text defines a social or political problem.
- **`diagnose_cause`** – Whether the text attributes causes or sources for the issue.
- **`moral_judgment`** – Whether the text expresses a normative or value-laden evaluation.
- **`suggest_remedy`** – Whether the text suggests solutions, actions, or remedies.
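Because the four elements can co-occur in the same text, each label is an independent binary decision: a sigmoid is applied to each logit and thresholded at 0.5. A minimal sketch of that mapping, using illustrative logit values rather than real model outputs:

```python
import math

LABELS = ["define_problem", "diagnose_cause", "moral_judgment", "suggest_remedy"]

def logits_to_labels(logits, threshold=0.5):
    """Apply a per-label sigmoid and threshold to get binary predictions."""
    probs = [1 / (1 + math.exp(-z)) for z in logits]
    return {label: prob > threshold for label, prob in zip(LABELS, probs)}

# Illustrative logits only; the real model produces these per input text.
print(logits_to_labels([2.1, 0.3, -1.4, 0.9]))
# {'define_problem': True, 'diagnose_cause': True, 'moral_judgment': False, 'suggest_remedy': True}
```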

---

## Model

- **Base Model:** `bert-base-uncased`
- **Architecture:** `BertForSequenceClassification`
- **Task Type:** Multi-label classification (binary prediction per label)
- **Tokenizer:** `BertTokenizer` from HuggingFace

---

## Training Details

- **Dataset:** Custom annotated dataset labeled with Entman's framing elements.
- **Training Strategy:** Fine-tuned with HuggingFace Transformers; hyperparameters tuned with Optuna.
- **Validation Split:** 80/20 train-validation split
- **Evaluation Metrics:**
- Accuracy
- F1 Macro
- Precision Macro
- Recall Macro
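The exact evaluation code is not included in this card; a sketch of how these macro-averaged metrics could be computed for multi-label predictions with scikit-learn, on toy data:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Toy multi-label targets and predictions (one column per framing element);
# illustrative only, not the project's actual evaluation data.
y_true = np.array([[1, 0, 1, 0],
                   [1, 1, 0, 1],
                   [0, 1, 1, 0]])
y_pred = np.array([[1, 0, 1, 1],
                   [1, 1, 0, 1],
                   [0, 0, 1, 0]])

# Note: accuracy_score on multi-label arrays is subset (exact-match) accuracy,
# which is why it can sit far below the per-label macro metrics, as in the
# training table below.
metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "f1_macro": f1_score(y_true, y_pred, average="macro", zero_division=0),
    "precision_macro": precision_score(y_true, y_pred, average="macro", zero_division=0),
    "recall_macro": recall_score(y_true, y_pred, average="macro", zero_division=0),
}
print(metrics)
```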

### Best Hyperparameters (Optuna)
- **Learning Rate:** `4.24e-5`
- **Weight Decay:** `0.222`
- **Epochs:** `3`
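The Optuna objective itself is not shown in this card. As an illustration of the search, here is a simple random-search stand-in over a plausible search space (Optuna's TPE sampler would explore it more efficiently), with a synthetic score in place of an actual fine-tuning run; the ranges and the objective are assumptions, not the project's code:

```python
import random

# Assumed search ranges; the real study's bounds are not documented in this card.
SEARCH_SPACE = {
    "learning_rate": (1e-5, 5e-5),  # log-uniform in a typical Optuna study
    "weight_decay": (0.0, 0.3),
    "epochs": (2, 4),
}

def sample_trial(rng):
    """Draw one hyperparameter configuration from the search space."""
    return {
        "learning_rate": rng.uniform(*SEARCH_SPACE["learning_rate"]),
        "weight_decay": rng.uniform(*SEARCH_SPACE["weight_decay"]),
        "epochs": rng.randint(*SEARCH_SPACE["epochs"]),
    }

def objective(params):
    """Stand-in score; a real objective would fine-tune BERT and return val F1 macro."""
    return (1.0
            - abs(params["learning_rate"] - 4.24e-5) * 1e4
            - abs(params["weight_decay"] - 0.222))

rng = random.Random(0)
trials = [sample_trial(rng) for _ in range(20)]
best = max(trials, key=objective)
print(best)
```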

---

## Training Performance

| Epoch | Train Loss | Val Loss | Accuracy | F1 Macro | Precision Macro | Recall Macro |
|-------|------------|----------|----------|----------|------------------|---------------|
| 1 | 0.6594 | 0.6530 | 0.2200 | 0.6758 | 0.5775 | 0.8178 |
| 2 | 0.6027 | 0.6355 | 0.2317 | 0.6329 | 0.6204 | 0.6490 |
| 3 | 0.5577 | 0.6404 | 0.2283 | 0.6386 | 0.6280 | 0.6566 |

- **Best F1 Macro:** `0.6386` (Trial 2)

---

## Final Prediction Example

**Predicted Framing Elements:**

```json
{
  "define_problem": true,
  "diagnose_cause": true,
  "moral_judgment": true,
  "suggest_remedy": true
}
```


## 🚀 Usage

```python
from transformers import BertTokenizerFast, BertForSequenceClassification
import torch

model = BertForSequenceClassification.from_pretrained("nurdyansa/framing-bert-model")
tokenizer = BertTokenizerFast.from_pretrained("nurdyansa/framing-bert-model")

text = "The government is failing to address the root causes of inflation."
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)

with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label prediction: sigmoid per label, thresholded at 0.5
probs = torch.sigmoid(logits).squeeze()
predictions = (probs > 0.5).int()

# Map predictions to labels
labels = ["define_problem", "diagnose_cause", "moral_judgment", "suggest_remedy"]
output = {label: bool(pred) for label, pred in zip(labels, predictions.tolist())}
print(output)
```

## 🏷️ Model Card Metadata

```yaml
---
license: mit
language:
- en
metrics:
- f1
- precision
- recall
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
library_name: transformers
tags:
- framing
- multi-label
- bert
- optuna
- classification
- social-science
---
```