---
tags:
- text
- stance
language:
- en
metrics:
- f1
- accuracy
pipeline_tag: text-classification
widget:
- text: user Bolsonaro is the president of Brazil. He speaks for all brazilians. Greta is a climate activist. Their opinions do create a balance that the world needs now
example_title: example 1
- text: user The fact is that she still doesn’t change her ways and still stays non environmental friendly
example_title: example 2
- text: user The criteria for these awards dont seem to be very high.
example_title: example 3
model-index:
- name: StanceBERTa
results:
- task:
type: text-classification
        name: Text Classification
dataset:
        type: social media
        name: unpublished
metrics:
- type: f1
value: 77.8
- type: accuracy
value: 78.5
---
# eevvgg/StanceBERTa
<!-- Provide a quick summary of what the model is/does. -->
This model is a fine-tuned version of the **distilroberta-base** model that predicts three categories of stance (negative, positive, neutral) toward an entity mentioned in the text.
It was fine-tuned on a larger and more balanced data sample than the previous version, [eevvgg/Stance-Tw](https://huggingface.co/eevvgg/Stance-Tw).
- **Developed by:** Ewelina Gajewska
- **Model type:** RoBERTa for stance classification
- **Language(s) (NLP):** English social media data from Twitter and Reddit
- **Finetuned from model:** [distilroberta-base](https://huggingface.co/distilroberta-base)
## Uses
```python
from transformers import pipeline

model_path = "eevvgg/StanceBERTa"
cls_task = pipeline(task="text-classification", model=model_path, tokenizer=model_path)  # add device=0 to run on GPU

sequence = [
    "user The fact is that she still doesn’t change her ways and still stays non environmental friendly",
    "user The criteria for these awards dont seem to be very high.",
]
result = cls_task(sequence)  # list of dicts with "label" and "score" keys
```
The model is suited to stance classification in short texts. It was fine-tuned on a balanced corpus of 5.6k examples, partially semi-annotated.
It is also suitable for further fine-tuning on hate/offensive language detection.
## Model Sources
<!-- Provide the basic links for the model. -->
- **Repository:** training procedure available in [Colab notebook](https://colab.research.google.com/drive/1-C47Ei7vgYtcfLLBB_Vkm3nblE5zH-aL?usp=sharing)
- **Paper:** tba
## Training Details
### Preprocessing
Normalization of user mentions and hyperlinks to "@user" and "http" tokens, respectively.
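The normalization step can be sketched with two regular-expression substitutions. The exact patterns used in training are not published, so the regexes below are an assumption:

```python
import re

def normalize(text: str) -> str:
    """Replace user mentions and hyperlinks with "@user" and "http" tokens,
    mirroring the preprocessing described above (exact regexes are assumed)."""
    text = re.sub(r"@\w+", "@user", text)            # user mentions -> "@user"
    text = re.sub(r"https?://\S+", "http", text)     # hyperlinks -> "http"
    return text
```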
### Training Hyperparameters
- epochs: 3
- mini-batch size: 8
- loss: 0.509
- learning_rate: 5e-5
- weight_decay: 1e-2
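The hyperparameters above map directly onto a `transformers` training configuration. A minimal sketch, assuming the standard `Trainer` API was used (`output_dir` is a placeholder):

```python
from transformers import TrainingArguments

# Configuration matching the reported hyperparameters; all other
# settings (output_dir, logging, evaluation strategy) are assumptions.
training_args = TrainingArguments(
    output_dir="stanceberta-out",     # placeholder path
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=5e-5,
    weight_decay=1e-2,
)
```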
## Evaluation
### Results
- evaluated on a held-out 15% of the data.
- accuracy: 0.785
- macro avg:
- f1: 0.778
- precision: 0.779
- recall: 0.778
- weighted avg:
- f1: 0.786
- precision: 0.786
- recall: 0.785
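For reference, the macro average above is the unweighted mean of the per-class scores, while the weighted average weights each class by its support. A minimal sketch with hypothetical per-class values (the real per-class breakdown is not reported here):

```python
# Macro vs. weighted averaging, with hypothetical per-class F1 scores
# and class supports -- these numbers are illustrative assumptions.
per_class_f1 = [0.80, 0.75, 0.79]   # e.g. negative, positive, neutral
support = [300, 250, 290]           # hypothetical class sizes

macro_f1 = sum(per_class_f1) / len(per_class_f1)
weighted_f1 = sum(f * s for f, s in zip(per_class_f1, support)) / sum(support)
```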
## Citation
**BibTeX:** tba