# Model Card: modernbert-political-bias

## Model Description
This is a fine-tuned version of the `answerdotai/ModernBERT-base` model for political bias classification. It has been trained to classify text into categories representing different political viewpoints.
## Intended Uses
This model is intended for classifying the political bias of English text. It can be used in applications such as:
- Analyzing political discourse.
- Categorizing news articles or social media posts by bias.
- Research on political polarization and language.
## Training Data
The model was fine-tuned on the `Faith1712/Allsides_political_bias_proper` dataset, which contains text samples labeled with different political biases.
- Training samples: 13889
- Validation samples: 1736
- Test samples: 1737
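The split sizes above correspond to roughly an 80/10/10 train/validation/test partition, which a few lines of arithmetic confirm:

```python
# Reported split sizes from the dataset description above
train, val, test = 13889, 1736, 1737
total = train + val + test  # 17362 samples overall

# Fractions of the full dataset (approximately 80/10/10)
fractions = [round(n / total, 2) for n in (train, val, test)]
print(total, fractions)  # → 17362 [0.8, 0.1, 0.1]
```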
## Training Procedure
The model was fine-tuned with the following configuration:
- Base model: `answerdotai/ModernBERT-base`
- Optimizer: AdamW
- Learning rate: 2e-5
- Epochs: 1
- Batch size: 16
- Max sequence length: 128
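Given the training-set size and batch size above, one epoch corresponds to a fixed number of optimizer steps. A quick sketch of that arithmetic, assuming the final partial batch is kept:

```python
import math

train_samples = 13889
batch_size = 16

# Number of optimizer steps in one epoch, counting the final partial batch
steps_per_epoch = math.ceil(train_samples / batch_size)
print(steps_per_epoch)  # → 869
```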
## Evaluation Results
The model was evaluated on a held-out test set.
- Validation Loss (after 1 epoch): 0.3688
- Validation Accuracy (after 1 epoch): 0.8531
- Final Test Accuracy: 0.8555
A comparison was also made with the `bucketresearch/politicalBiasBERT` model on the same test set:
- Fine-tuned ModernBERT accuracy: 0.8555
- `bucketresearch/politicalBiasBERT` accuracy: 0.4588
## Limitations and Bias
- The model's performance depends on the quality and characteristics of the training data, and it may exhibit biases present in the `Faith1712/Allsides_political_bias_proper` dataset.
- The concept of "political bias" is subjective and complex; the model's classifications reflect the labels provided in the training data.
- The model is trained on English text and may not perform well on other languages.
## How to Use
You can load and use this model with the Hugging Face `transformers` library.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Replace "test0198/modernbert-political-bias" with your actual repository ID on the Hugging Face Hub
model_name = "test0198/modernbert-political-bias"

# Load the tokenizer and model from the Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Move the model to the GPU if one is available, and switch to inference mode
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

# Example text to classify
text_to_classify = "This is an example sentence that we want to classify for political bias."

# Tokenize the text and move the input tensors to the same device as the model
inputs = tokenizer(text_to_classify, return_tensors="pt", truncation=True, padding=True, max_length=128)
inputs = {key: value.to(device) for key, value in inputs.items()}

# Get the model's output (logits) and take the highest-scoring class
with torch.no_grad():
    logits = model(**inputs).logits
predicted_class_id = torch.argmax(logits, dim=1).item()

# Map the predicted class index to a label; the mapping must match the
# label order used during training (e.g. 0=left, 1=center, 2=right)
label_map = {0: "Left", 1: "Center", 2: "Right"}  # Replace with your actual label mapping
predicted_label = label_map.get(predicted_class_id, "Unknown")

print(f"The predicted political bias for the text is: {predicted_label}")
```
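Beyond the top class, the logits can be converted to per-class probabilities with a softmax, giving a confidence score for each prediction. A minimal, self-contained sketch using illustrative logit values (the actual values depend on the model and input):

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    # Subtract the max logit for numerical stability before exponentiating
    exps = [math.exp(x - max(logits)) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the three classes (Left, Center, Right)
logits = [2.1, -0.3, 0.4]
probs = softmax(logits)

labels = ["Left", "Center", "Right"]
for label, p in zip(labels, probs):
    print(f"{label}: {p:.3f}")
```

With real model outputs, `logits` would be `model(**inputs).logits[0].tolist()`, and the probabilities can be thresholded to flag low-confidence predictions.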