thomasrenault/emotion

A multi-label emotion intensity classifier fine-tuned on US tweets, campaign speeches and congressional speeches. Built on distilbert-base-uncased with GPT-4o-mini annotation via the OpenAI Batch API.

Labels

The model predicts 8 independent emotion intensities (sigmoid, range 0–1):

| Label |
|---|
| anger |
| sadness |
| fear |
| disgust |
| pride |
| joy |
| gratitude |
| hope |

Scores are independent — multiple emotions can be high simultaneously.
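Because each label has its own sigmoid, the eight scores are not forced to sum to 1 the way a softmax would be. A minimal sketch with hypothetical logits (the values below are illustrative, not model output) makes the difference concrete:

```python
import math

def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-x))

LABELS = ["anger", "sadness", "fear", "disgust", "pride", "joy", "gratitude", "hope"]

# Hypothetical logits for one document from the 8 classification heads.
logits = [2.1, -3.0, 1.4, -2.5, -1.8, -2.2, -3.1, -0.4]

scores = {label: round(sigmoid(z), 3) for label, z in zip(LABELS, logits)}

# Unlike softmax probabilities, these need not sum to 1: here both
# "anger" and "fear" exceed 0.5 at the same time.
print(scores)
```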

Training

| Setting | Value |
|---|---|
| Base model | distilbert-base-uncased |
| Architecture | DistilBertForSequenceClassification (multi-label) |
| Problem type | multi_label_classification |
| Training data | ~200,000 labeled documents |
| Annotation | GPT-4o-mini (temperature=0) via OpenAI Batch API |
| Epochs | 4 |
| Learning rate | 2e-5 |
| Batch size | 16 |
| Max length | 512 tokens |
| Domain | US tweets about policy, campaign speeches and congressional floor speeches |
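With `problem_type="multi_label_classification"`, Transformers trains the heads with a sigmoid plus binary cross-entropy per label (rather than softmax cross-entropy). A small sketch of that loss on dummy tensors — the logits and targets below are made up for illustration:

```python
import torch
import torch.nn as nn

# One document, 8 emotion heads. Targets mark "anger" and "fear" as present.
logits  = torch.tensor([[2.0, -1.0, 0.5, -2.0, -1.5, -1.0, -2.5, 0.1]])
targets = torch.tensor([[1.0,  0.0, 1.0,  0.0,  0.0,  0.0,  0.0, 0.0]])

# BCEWithLogitsLoss fuses the sigmoid and the binary cross-entropy; this
# is the loss Transformers uses for multi_label_classification.
loss_fn = nn.BCEWithLogitsLoss()
loss = loss_fn(logits, targets)
print(float(loss))
```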

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "thomasrenault/emotion"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model     = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

EMOTIONS = ["anger", "sadness", "fear", "disgust", "pride", "joy", "gratitude", "hope"]
THRESHOLD = 0.5

def predict(text):
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        probs = torch.sigmoid(model(**enc).logits).squeeze().tolist()
    matched = [t for t, p in zip(EMOTIONS, probs) if p >= THRESHOLD]
    return matched or ["no emotion"]


sentences = ["Enough lies, enough hypocrisy", "I'm so proud of our government", "Climate change is a risk to our planet", "Trump is the president of the US"]
for sentence in sentences:
    print(sentence, predict(sentence))

# Enough lies, enough hypocrisy ['anger']
# I'm so proud of our government ['pride']
# Climate change is a risk to our planet ['fear']
# Trump is the president of the US ['no emotion']
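If you care about relative intensity rather than a hard 0.5 cutoff, the same probabilities can simply be ranked. A small helper along those lines (the `probs` values are hypothetical sigmoid outputs, not real model output):

```python
EMOTIONS = ["anger", "sadness", "fear", "disgust", "pride", "joy", "gratitude", "hope"]

def rank_emotions(probs):
    """Return all (label, score) pairs sorted by descending intensity."""
    return sorted(zip(EMOTIONS, probs), key=lambda kv: kv[1], reverse=True)

# Hypothetical sigmoid outputs for one document.
probs = [0.91, 0.12, 0.34, 0.08, 0.02, 0.03, 0.01, 0.22]
print(rank_emotions(probs)[:3])
```

This avoids committing to a single threshold, which matters because the 0.5 cutoff in the snippet above is a convention, not something calibrated on held-out data.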

Intended Use

  • Academic research on emotion in political communication
  • Analysis of congressional speeches and social media
  • Temporal trend analysis of emotional rhetoric
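For the temporal-trend use case, a common pattern is to score each document, then average intensities per day. A minimal pandas sketch with made-up data (the DataFrame columns and values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical per-document emotion scores with publication dates.
df = pd.DataFrame({
    "date":  pd.to_datetime(["2024-01-01", "2024-01-01", "2024-01-02"]),
    "anger": [0.8, 0.4, 0.1],
    "hope":  [0.1, 0.2, 0.9],
})

# Daily mean intensity per emotion: the basis for a trend plot.
daily = df.groupby("date")[["anger", "hope"]].mean()
print(daily)
```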

Limitations

  • Trained exclusively on US English political text — performance may degrade on other domains
  • Emotions are subjective; inter-annotator agreement on intensity scores is inherently noisy
  • Labels are silver-standard (LLM-generated), not human-verified gold labels

Citation

If you use this model, please cite the working paper (https://socialeconomicslab.org/research/working-papers/emotions-and-policy/):

@article{algan2026emotions,
  title={Emotions and policy views},
  author={Algan, Y. and Davoine, E. and Renault, T. and Stantcheva, S.},
  year={2026}
}