Update README.md
- **License:** MIT
- **Finetuned from model:** `bert-base-uncased`
### Model Sources

- **Demo:** Available via the Hugging Face inference API
### Training Data

The model was trained on a dataset of 3,000 news articles from the following sources:

- nbcnews.com
- cnn.com
- cnbc.com
- apnews.com
- nytimes.com
- washingtonpost.com

Each article was annotated for media framing according to Entman's theory.
### Training Procedure

- Preprocessing: tokenization using the `bert-base-uncased` tokenizer
- Loss function: CrossEntropyLoss
- Optimizer: AdamW
- Batch size: 16
- Epochs: 4
- Learning rate: 2e-5
- Precision: fp16 mixed precision
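The card lists hyperparameters but not the training script. The sketch below shows what a single optimization step with the settings above (AdamW, CrossEntropyLoss, batch size 16, learning rate 2e-5) might look like; a tiny stand-in network replaces BERT so the snippet runs without downloading weights, and fp16 autocast is omitted since it requires a GPU.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in classifier (the real model is BERT-based); used here
# only so the optimization step is runnable without a checkpoint download.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss_fn = nn.CrossEntropyLoss()  # as listed in the training procedure

x = torch.randn(16, 16)          # batch size 16, dummy features
y = torch.randint(0, 4, (16,))   # one frame class per example

logits = model(x)                # shape: (16, 4)
loss = loss_fn(logits, y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(round(loss.item(), 4))
```

On a GPU, the fp16 part would typically wrap the forward pass in `torch.cuda.amp.autocast()` with a `GradScaler` around `backward()`/`step()`.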
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

A held-out test set of 3,000 news articles from the same sources as the training data.

#### Metrics

- Accuracy
- F1-score (macro average)
### Results

- Accuracy: 0.2400
- Macro F1: 0.6386
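The card does not say how these numbers were computed. A plausible reconstruction uses scikit-learn on multi-label indicator arrays; note that `accuracy_score` on such arrays measures subset accuracy (all four labels must match exactly), which tends to yield low values like the 0.24 above even when macro F1 is reasonable. The arrays below are toy illustrations, not real model output.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# Toy multi-label predictions for two documents, one column per framing label.
y_true = np.array([[1, 0, 1, 0],
                   [0, 1, 1, 1]])
y_pred = np.array([[1, 0, 0, 0],
                   [0, 1, 1, 0]])

# On multi-label indicator arrays, accuracy_score is subset (exact-match) accuracy.
subset_acc = accuracy_score(y_true, y_pred)
# Macro F1 averages the per-label F1 scores.
macro_f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)
print(subset_acc, round(macro_f1, 4))  # → 0.0 0.6667
```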
## Environmental Impact

- **Hardware Type:** Google Colab
- **Hours used:** ~2 hours
- **Cloud Provider:** Google Cloud
- **Compute Region:** US
- **Carbon Emitted:** Estimated via the [ML CO2 calculator](https://mlco2.github.io/impact#compute)
# Framing-BERT: Multi-label Classification for Framing Elements Detection

This repository provides a fine-tuned BERT-based model for detecting framing elements in text, based on the four key elements identified by Entman (1993). The model classifies multiple framing elements simultaneously using multi-label classification.
## Framing Elements

The model predicts the presence of the following **framing elements**, derived from Entman's framing theory:

- **`define_problem`** – Indicates whether the text defines a social or political problem.
- **`diagnose_cause`** – Detects whether the text attributes causes or sources of the issue.
- **`moral_judgment`** – Identifies normative or value-laden evaluations.
- **`suggest_remedy`** – Indicates whether the text proposes solutions, actions, or remedies.

These categories correspond to Entman's (1993) core functions of framing: defining problems, diagnosing causes, making moral judgments, and suggesting remedies.
## Model Information

- **Base model**: `bert-base-uncased`
- **Fine-tuned task**: Multi-label text classification
- **Number of labels**: 4
- **Best performance (F1 Macro)**: `0.6386`
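The card does not include the model's config. For multi-label classification, the usual Transformers setup is `problem_type="multi_label_classification"`, which makes `BertForSequenceClassification` apply a sigmoid-based BCE loss during training. A sketch with an assumed, deliberately tiny config (randomly initialized, so it runs without downloading the checkpoint):

```python
from transformers import BertConfig, BertForSequenceClassification

labels = ["define_problem", "diagnose_cause", "moral_judgment", "suggest_remedy"]

# Assumed illustration config with shrunken dimensions; the published
# checkpoint's actual config may differ.
config = BertConfig(
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=128,
    num_labels=len(labels),
    problem_type="multi_label_classification",
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)
model = BertForSequenceClassification(config)  # random weights, structure only
print(model.config.num_labels, model.config.problem_type)
```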
## Training Details

- **Optimizer**: AdamW
- **Scheduler**: Linear with warmup
- **Hyperparameter search**: Optuna
- **Best hyperparameters**:
  - `learning_rate`: 4.24e-05
  - `weight_decay`: 0.222
  - `num_train_epochs`: 3
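A linear-with-warmup schedule ramps the learning rate from 0 to its peak over the warmup steps, then decays it linearly back to 0. The card does not give the warmup length, so the step counts below are purely illustrative; the function has the multiplier form that `torch.optim.lr_scheduler.LambdaLR` (or `transformers.get_linear_schedule_with_warmup`) expects.

```python
def linear_schedule_with_warmup(step, num_warmup_steps, num_training_steps):
    """LR multiplier: ramps 0 -> 1 over warmup, then decays linearly to 0."""
    if step < num_warmup_steps:
        return step / max(1, num_warmup_steps)
    return max(0.0, (num_training_steps - step)
               / max(1, num_training_steps - num_warmup_steps))

# Illustration only: 100 warmup steps out of 1000 (values not from the card).
peak_lr = 4.24e-5  # best learning rate found by the Optuna search
lrs = [peak_lr * linear_schedule_with_warmup(s, 100, 1000) for s in range(1000)]
print(f"{max(lrs):.2e}")  # peaks at the configured learning rate → 4.24e-05
```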
### Metrics per Epoch (Best Run)

| Epoch | Train Loss | Val Loss | Accuracy | F1 Macro | Precision Macro | Recall Macro |
|-------|------------|----------|----------|----------|-----------------|--------------|
| 1     | 0.6594     | 0.6530   | 0.2200   | 0.6758   | 0.5775          | 0.8178       |
| 2     | 0.6027     | 0.6355   | 0.2317   | 0.6329   | 0.6204          | 0.6490       |
| 3     | 0.5577     | 0.6404   | 0.2283   | 0.6386   | 0.6280          | 0.6566       |
## Example Prediction

```
Predicted Labels:
{
    "define_problem": True,
    "diagnose_cause": True,
    "moral_judgment": True,
    "suggest_remedy": True
}
```
## How to Use

```python
from transformers import BertTokenizerFast, BertForSequenceClassification
import torch

tokenizer = BertTokenizerFast.from_pretrained("nurdyansa/framing-bert-model")
model = BertForSequenceClassification.from_pretrained("nurdyansa/framing-bert-model")
model.eval()

text = "Government policies have led to an increase in unemployment, which is unacceptable and must be addressed through immediate reform."

inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    logits = model(**inputs).logits
# Sigmoid per label: each framing element is predicted independently.
predictions = torch.sigmoid(logits).squeeze().tolist()

labels = ["define_problem", "diagnose_cause", "moral_judgment", "suggest_remedy"]
predicted_labels = {label: pred > 0.5 for label, pred in zip(labels, predictions)}
print(predicted_labels)
```
## Citation

If you use this model in your research or application, please cite it as:

```bibtex
@misc{nurdyansa_2025,
  author    = { Nurdyansa },
  title     = { framing-bert-model (Revision f03db73) },
  year      = 2025,
  url       = { https://huggingface.co/nurdyansa/framing-bert-model },
  doi       = { 10.57967/hf/5387 },
  publisher = { Hugging Face }
}
```
## Contributing

Researchers and practitioners are warmly invited to collaborate on improving this model's precision. Please contribute by:

- Providing more annotated data.
- Improving label consistency or adding nuance.
- Suggesting improvements to the model architecture or training methods.

Together, we can build a more accurate framing analysis tool. If you have questions, please email me at nurdyansa@gmail.com.