jvasilakes committed
Commit 6fdad75 · verified · 1 Parent(s): c70968a

Update README.md

Files changed (1)
  1. README.md +38 -3
README.md CHANGED
@@ -1,3 +1,38 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ language:
+ - cs
+ - en
+ - es
+ - fr
+ - hi
+ - pt
+ base_model:
+ - cardiffnlp/twitter-xlm-roberta-large-2022
+ pipeline_tag: text-classification
+ ---
+ # Model Overview
+
+ This model classifies the stance of a reply tweet towards its source tweet, given source-reply tweet pairs from Twitter/X.
+ It was fine-tuned on the training split of RumourEval2019 together with synthetic tweets generated by [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B).
+ The starting point for fine-tuning was [cardiffnlp/twitter-xlm-roberta-large-2022](https://huggingface.co/cardiffnlp/twitter-xlm-roberta-large-2022).
+
+ ## Synthetic Tweet Data
+
+ The synthetic data was generated in Czech, English, Spanish, French, Hindi, and Portuguese.
+
+ ## Usage
+
+ ```python
+ import torch
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+ model_path = "GateNLP/stance-twitter-xlm-multilingual"
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
+ model = AutoModelForSequenceClassification.from_pretrained(model_path, num_labels=4)
+
+ source_tweet = "The Golden State Fence Company, hired to build part of the US-Mexico border wall, was fined $5 million for hiring illegal immigrant workers."
+ reply_tweet = "@USER When did this happen?"
+ # Encode the source tweet and its reply together as one sentence pair.
+ inputs = tokenizer(text=source_tweet, text_pair=reply_tweet, return_tensors="pt")
+ with torch.no_grad():  # inference only, no gradients needed
+     outputs = model(**inputs)
+
+ labels = ["support", "deny", "query", "comment"]
+ # The index of the largest logit selects the predicted stance.
+ prediction = labels[outputs.logits.argmax(dim=-1).item()]
+ ```
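
The Usage snippet keeps only the arg-max label. When a score per stance is also wanted, the logits can be passed through a softmax. The following is a minimal sketch, not part of the commit or of the model repository: `rank_stances` is a hypothetical helper, the label order is assumed to be the one listed in the Usage snippet, and the example logits are made up for illustration. It is written in plain Python and takes the logits as a list (for instance `outputs.logits[0].tolist()`), so it runs without loading the model.

```python
import math

# Label order assumed from the README's Usage snippet.
LABELS = ["support", "deny", "query", "comment"]


def rank_stances(logits, labels=LABELS):
    """Hypothetical helper: softmax the logits and rank labels by probability."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Highest-probability stance first.
    return sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)


# Made-up logits for illustration only, not real model output.
example_logits = [0.2, -1.3, 2.1, 0.9]
ranked = rank_stances(example_logits)
for label, prob in ranked:
    print(f"{label}: {prob:.3f}")
```

With real model output, `ranked[0][0]` matches the `prediction` computed in the Usage snippet, since softmax preserves the ordering of the logits.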