---
license: mit
language:
- cs
- en
- es
- fr
- hi
- pt
base_model:
- cardiffnlp/twitter-xlm-roberta-large-2022
pipeline_tag: text-classification
---

# Model Overview

This model performs stance classification on source-reply tweet pairs from Twitter/X. It was fine-tuned on the training split of RumourEval 2019, augmented with synthetic tweets generated by [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B). The starting point for fine-tuning was [cardiffnlp/twitter-xlm-roberta-large-2022](https://huggingface.co/cardiffnlp/twitter-xlm-roberta-large-2022).

## Synthetic Tweet Data

The synthetic data was generated in Czech, English, Spanish, French, Hindi, and Portuguese.

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_path = "GateNLP/stance-twitter-xlm-multilingual"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path, num_labels=4)

source_tweet = "The Golden State Fence Company, hired to build part of the US-Mexico border wall, was fined $5 million for hiring illegal immigrant workers."
reply_tweet = "@USER When did this happen?"

# Encode the source tweet and the reply as a sentence pair
inputs = tokenizer(text=source_tweet, text_pair=reply_tweet, return_tensors="pt")
outputs = model(**inputs)

labels = ["support", "deny", "query", "comment"]
prediction = labels[outputs.logits.argmax().item()]
```
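
If per-label confidence scores are also of interest, the raw logits can be converted to probabilities with a softmax. A minimal sketch, using placeholder logits in place of a real `model(**inputs).logits` call so it runs without downloading the model:

```python
import torch

labels = ["support", "deny", "query", "comment"]

# Placeholder logits standing in for model(**inputs).logits
logits = torch.tensor([[2.1, -1.3, 0.4, 1.0]])

# Softmax over the label dimension yields a probability per stance class
probs = torch.softmax(logits, dim=-1)

prediction = labels[probs.argmax().item()]
scores = {label: round(p, 3) for label, p in zip(labels, probs[0].tolist())}
```

The same `labels[...argmax().item()]` pattern as above still gives the predicted class; `scores` additionally exposes how confident the model is in each of the four stances.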