GateNLP/stance-twitter-xlm-multilingual

Text Classification


jvasilakes committed on Apr 15, 2025

Commit 6fdad75 · verified · 1 Parent(s): c70968a

Update README.md

Files changed (1)

README.md +38 -3

@@ -1,3 +1,38 @@
----
-license: mit
----

+---
+license: mit
+language:
+- cs
+- en
+- es
+- fr
+- hi
+- pt
+base_model:
+- cardiffnlp/twitter-xlm-roberta-large-2022
+pipeline_tag: text-classification
+---
+# Model Overview
+This model is for stance classification on source-reply tweet pairs from Twitter/X.
+It was fine-tuned on the training split of RumourEval2019 alongside synthetic tweets generated by [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B).
+The starting point for fine-tuning was [cardiffnlp/twitter-xlm-roberta-large-2022](https://huggingface.co/cardiffnlp/twitter-xlm-roberta-large-2022).
+## Synthetic Tweet Data
+The synthetic data was generated in Czech, English, Spanish, French, Hindi, and Portuguese.
+## Usage
+```python
+from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+# Load the fine-tuned tokenizer and the 4-way stance classifier.
+model_path = "GateNLP/stance-twitter-xlm-multilingual"
+tokenizer = AutoTokenizer.from_pretrained(model_path)
+model = AutoModelForSequenceClassification.from_pretrained(model_path, num_labels=4)
+
+# The model takes a (source tweet, reply tweet) pair as input.
+source_tweet = "The Golden State Fence Company, hired to build part of the US-Mexico border wall, was fined $5 million for hiring illegal immigrant workers."
+reply_tweet = "@USER When did this happen?"
+inputs = tokenizer(text=source_tweet, text_pair=reply_tweet, return_tensors="pt")
+outputs = model(**inputs)
+
+# Map the highest-scoring logit to its stance label.
+labels = ["support", "deny", "query", "comment"]
+prediction = labels[outputs.logits.argmax(dim=-1).item()]
+```
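
The usage snippet returns only the top-scoring label. As a minimal sketch that is not part of the committed README, the raw logits could also be turned into per-label probabilities with a softmax. The helper name `stance_probabilities` and the example logits are hypothetical, and the label order is assumed to match the list given in the README.

```python
import math

# Label order copied from the README snippet; assumed to match the model's output indices.
LABELS = ["support", "deny", "query", "comment"]


def stance_probabilities(logits):
    """Hypothetical helper: softmax over one row of logits, keyed by stance label."""
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(LABELS, exps)}


# Made-up logits for illustration only; they are not real model output.
probs = stance_probabilities([0.1, -1.2, 2.3, 0.4])
best = max(probs, key=probs.get)
```

With the model from the usage snippet loaded, a single row of logits could be passed in as `outputs.logits[0].tolist()`.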