---
license: mit
language:
- cs
- en
- es
- fr
- hi
- pt
base_model:
- cardiffnlp/twitter-xlm-roberta-large-2022
pipeline_tag: text-classification
---

# Model Overview

This model performs stance classification on source-reply tweet pairs from Twitter/X. It was fine-tuned on the training split of RumourEval 2019, augmented with synthetic tweets generated by [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B). The starting point for fine-tuning was [cardiffnlp/twitter-xlm-roberta-large-2022](https://huggingface.co/cardiffnlp/twitter-xlm-roberta-large-2022).

## Synthetic Tweet Data

The synthetic data was generated in Czech, English, Spanish, French, Hindi, and Portuguese.

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_path = "GateNLP/stance-twitter-xlm-multilingual"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path, num_labels=4)

source_tweet = "The Golden State Fence Company, hired to build part of the US-Mexico border wall, was fined $5 million for hiring illegal immigrant workers."
reply_tweet = "@USER When did this happen?"

# Encode the source tweet and the reply as a sentence pair
inputs = tokenizer(text=source_tweet, text_pair=reply_tweet, return_tensors="pt")
outputs = model(**inputs)

labels = ["support", "deny", "query", "comment"]
prediction = labels[outputs.logits.argmax().item()]
```
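
If per-label confidence scores are also of interest, the raw logits can be converted to probabilities with a softmax. A minimal sketch, using placeholder logits in place of a real `model(**inputs).logits` call so it runs without downloading the model:

```python
import torch

labels = ["support", "deny", "query", "comment"]

# Placeholder logits standing in for model(**inputs).logits
logits = torch.tensor([[2.1, -1.3, 0.4, 1.0]])

# Softmax over the label dimension yields a probability per stance class
probs = torch.softmax(logits, dim=-1)

prediction = labels[probs.argmax().item()]
scores = {label: round(p, 3) for label, p in zip(labels, probs[0].tolist())}
```

The same `labels[...argmax().item()]` pattern as above still gives the predicted class; `scores` additionally exposes how confident the model is in each of the four stances.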