create model card
README.md
ADDED
---
tags:
- text
- stance
language:
- en
metrics:
- f1
- accuracy
pipeline_tag: text-classification

widget:
- text: user Bolsonaro is the president of Brazil. He speaks for all brazilians. Greta is a climate activist. Their opinions do create a balance that the world needs now
  example_title: example 1
- text: user The fact is that she still doesn’t change her ways and still stays non environmental friendly
  example_title: example 2
- text: user The criteria for these awards dont seem to be very high.
  example_title: example 3

model-index:
- name: StanBERT
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      type: social media
      name: unpublished
    metrics:
    - type: f1
      value: 91.4
    - type: accuracy
      value: 91.2
---

# eevvgg/StanBERT

This model is a fine-tuned version of [j-hartmann/sentiment-roberta-large-english-3-classes](https://huggingface.co/j-hartmann/sentiment-roberta-large-english-3-classes) that predicts three categories of stance (negative, positive, neutral) towards an entity mentioned in the text.
It was fine-tuned on a larger and more balanced data sample than the previous version, [eevvgg/BEtMan-Tw](https://huggingface.co/eevvgg/BEtMan-Tw).

- **Developed by:** Ewelina Gajewska
- **Model type:** RoBERTa for stance classification
- **Language(s) (NLP):** English (social media data from Twitter and Reddit)
- **Finetuned from model:** [j-hartmann/sentiment-roberta-large-english-3-classes](https://huggingface.co/j-hartmann/sentiment-roberta-large-english-3-classes)

## Uses

```python
from transformers import pipeline

model_path = "eevvgg/StanBERT"
# Pass device=0 to run the pipeline on a GPU.
cls_task = pipeline(task="text-classification", model=model_path, tokenizer=model_path)

sequence = ['his rambling has no clear ideas behind it',
            'That has nothing to do with medical care',
            "Turns around and shows how qualified she is because of her political career.",
            'She has very little to gain by speaking too much']

result = cls_task(sequence)  # one {'label': ..., 'score': ...} dict per input
```

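The pipeline returns a list with one prediction per input. A minimal sketch for printing the predictions and checking which stance label each class id maps to; the exact label strings live in the model config, so it is safer to inspect them than to assume them:

```python
from transformers import AutoConfig

# Print each prediction next to its input text.
for text, pred in zip(sequence, result):
    print(f"{pred['label']} ({pred['score']:.3f}): {text}")

# The class-id-to-label mapping is stored in the model config.
config = AutoConfig.from_pretrained(model_path)
print(config.id2label)
```
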
Stance classification in English social media data. Fine-tuned on a balanced corpus of 8.4k examples, partially semi-annotated.
The model is suited for stance classification in short texts.

## Model Sources

- **Repository:** training procedure available in a [Colab notebook](https://colab.research.google.com/drive/1-C47Ei7vgYtcfLLBB_Vkm3nblE5zH-aL?usp=sharing)
- **Paper:** tba

## Training Details

### Preprocessing

User mentions and hyperlinks were normalized to "user" and "url" tokens, respectively (see the sketch below).

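A minimal regex-based sketch of this normalization; the exact patterns are assumptions, not the authors' originals:

```python
import re

def normalize(text: str) -> str:
    # Replace @mentions with a "user" token and URLs with a "url" token,
    # matching the normalization described above (patterns are assumed).
    text = re.sub(r"@\w+", "user", text)
    text = re.sub(r"https?://\S+|www\.\S+", "url", text)
    return text

print(normalize("@greta see https://example.com"))  # -> "user see url"
```
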
### Training Hyperparameters

- trained for 2 epochs with a mini-batch size of 8
- loss: 0.574
- learning_rate: 4e-5; weight_decay: 1e-2

These settings map onto `transformers.TrainingArguments` as sketched below.

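A sketch only, assuming a standard `Trainer` setup; the full procedure is in the linked Colab notebook, and the dataset wiring is omitted because the corpus is unpublished:

```python
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          TrainingArguments)

base = "j-hartmann/sentiment-roberta-large-english-3-classes"
tokenizer = AutoTokenizer.from_pretrained(base)  # for preprocessing the (unpublished) corpus
model = AutoModelForSequenceClassification.from_pretrained(base)  # already 3 classes

args = TrainingArguments(
    output_dir="stanbert",           # assumed output path
    num_train_epochs=2,              # trained for 2 epochs
    per_device_train_batch_size=8,   # mini-batch size of 8
    learning_rate=4e-5,
    weight_decay=1e-2,
)
# Trainer(model=model, args=args, train_dataset=..., eval_dataset=...).train()
# would complete the run.
```
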
## Evaluation

### Results

Evaluation was performed on a held-out 15% of the data.

- accuracy: 91.2
- macro avg:
  - f1: 91.4
  - precision: 91.4
  - recall: 91.5
- weighted avg:
  - f1: 91.2
  - precision: 91.3
  - recall: 91.2

| class    | precision | recall | f1-score | support |
|----------|-----------|--------|----------|---------|
| neutral  | 0.930     | 0.868  | 0.898    | 471     |
| positive | 0.933     | 0.946  | 0.940    | 355     |
| negative | 0.878     | 0.931  | 0.904    | 433     |

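The per-class table above follows the format of scikit-learn's `classification_report`; a minimal sketch of producing such a report from gold and predicted labels, where the two label lists are illustrative stand-ins since the held-out split is unpublished:

```python
from sklearn.metrics import classification_report

# Illustrative stand-ins for the real gold/predicted stance labels.
y_true = ["neutral", "positive", "negative", "negative", "positive"]
y_pred = ["neutral", "positive", "negative", "neutral", "positive"]

print(classification_report(y_true, y_pred, digits=3))
```
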
## Citation

**BibTeX:** tba