Model Card for policlim

Model Description

This model detects climate change salience in (political) text. It fine-tunes the XLM-RoBERTa base model on 3,434 manually annotated quasi-sentences from political manifestos (retrieved from the Manifesto Project Database). The model achieves a validation F1 score of .935 and an accuracy of .957.

We have used the model to classify the climate change salience of political manifestos, the first step of which is detailed in the working paper below. The paper contains all relevant details of the training set, procedure, and evaluation of the model and final dataset.

Citation Information

@techreport{sanford2024policlim,
    title={Policlim: A Dataset of Climate Change Discourse in the Political Manifestos of 45 Countries from 1990-2022},
    author={Sanford, Mary and Pianta, Silvia and Schmid, Nicolas and Musto, Giorgio},
    type={Working paper},
    url={https://osf.io/preprints/osf/bq356_v4},
    year={2025}
}

How to get started with the model

You can use the model for text classification, or use it as a base model to fine-tune for additional tasks. The simpletransformers package makes this process very straightforward.

import pandas as pd
from simpletransformers.classification import ClassificationModel, ClassificationArgs

## To use for climate change salience detection:

# Load target data in whatever format you prefer; here, a CSV with a 'text' column.
data = pd.read_csv('your_data.csv')

model = ClassificationModel(
    model_type="xlmroberta", model_name="marysanford/policlim"
)

preds, raw_outputs = model.predict(data['text'].tolist())
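The second value returned by model.predict holds the raw model outputs (one logit per class). A minimal sketch, assuming binary logits, of converting them to class probabilities with a numerically stable softmax:

```python
import numpy as np

def logits_to_probs(raw_outputs):
    """Convert per-class logits to probabilities with a stable softmax."""
    logits = np.asarray(raw_outputs, dtype=float)
    shifted = logits - logits.max(axis=1, keepdims=True)  # avoid overflow in exp
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Dummy logits standing in for the raw outputs returned by model.predict
probs = logits_to_probs([[2.0, -1.0], [-0.5, 1.5]])
# Each row sums to 1; probs[:, 1] is the probability of the positive class
```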

## To use for further fine-tuning
from sklearn.metrics import f1_score, precision_score, recall_score, accuracy_score

# Load training data. Text must be in a 'text' column and the corresponding labels in a 'labels' column.
new_train = pd.read_csv('your_new_train_data.csv')
new_test = pd.read_csv('your_new_test_data.csv')
new_eval = pd.read_csv('your_new_eval_data.csv')

# Initialize the model with the updated arguments
model = ClassificationModel(
    model_type="xlmroberta",
    model_name="marysanford/policlim",
    num_labels=2,                 # Number of labels for the new task
#   args=model_args,              # Optional ClassificationArgs: hyperparameters, processing details, evaluation preferences
#   weight=weights,               # Optional class weights
    ignore_mismatched_sizes=True, # Required if the new task does not have exactly 2 labels
    use_cuda=True
)
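The commented-out weight argument above expects a list of per-class weights. One common way to derive them (a sketch; in practice the labels would come from new_train['labels']) uses scikit-learn's compute_class_weight:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical labels for illustration; replace with new_train['labels'].tolist()
labels = [0, 0, 0, 1]
weights = compute_class_weight(class_weight="balanced",
                               classes=np.unique(labels),
                               y=labels).tolist()
# 'balanced' gives n_samples / (n_classes * class_count): here [2/3, 2.0]
```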

# Train the model
# Train the model. Extra metrics are passed to train_model as keyword
# arguments whose values are functions taking (labels, preds).
model.train_model(train_df=new_train, eval_df=new_test,
                  f1=lambda labels, preds: f1_score(labels, preds, average='macro'))

# Evaluate the model
result, model_outputs, wrong_predictions = model.eval_model(
    new_eval,
    f1=lambda labels, preds: f1_score(labels, preds, average='macro'),
    precision=lambda labels, preds: precision_score(labels, preds, average='macro'),
    recall=lambda labels, preds: recall_score(labels, preds, average='macro'),
    acc=accuracy_score
)

print('\n\nThese are the results when evaluating the model on the evaluation data set:\n')
print(result)
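Any extra keyword argument passed to train_model or eval_model is treated by simpletransformers as a metric function called with (labels, preds). A minimal sketch of a named custom metric you could pass instead of a lambda:

```python
from sklearn.metrics import f1_score

def macro_f1(labels, preds):
    """Custom metric in the (labels, preds) signature simpletransformers expects."""
    return f1_score(labels, preds, average="macro")

# Usage: result, outputs, wrong = model.eval_model(new_eval, macro_f1=macro_f1)
score = macro_f1([0, 1, 1, 0], [0, 1, 0, 0])  # ≈ 0.733
```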


Model Card Authors

Mary Sanford, mary.sanford@cmcc.it

Model size: 0.3B parameters (F32, Safetensors)