---
tags:
- text
- stance
language:
- en
metrics:
- f1
- accuracy
pipeline_tag: text-classification

widget:
- text: user Bolsonaro is the president of Brazil. He speaks for all brazilians. Greta is a climate activist. Their opinions do create a balance that the world needs now
  example_title: example 1
- text: user The fact is that she still doesn’t change her ways and still stays non environmental friendly
  example_title: example 2
- text: user The criteria for these awards dont seem to be very high.
  example_title: example 3

model-index:
- name: StanBERT
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      type: social media
      name: unpublished
    metrics:
    - type: f1
      value: 91.4
    - type: accuracy
      value: 91.2
---

# eevvgg/StanBERT

This model is a fine-tuned version of [j-hartmann/sentiment-roberta-large-english-3-classes](https://huggingface.co/j-hartmann/sentiment-roberta-large-english-3-classes) that predicts three categories of stance (negative, positive, neutral) towards an entity mentioned in the text.
It was fine-tuned on a larger and more balanced data sample than the previous version, [eevvgg/BEtMan-Tw](https://huggingface.co/eevvgg/BEtMan-Tw).

- **Developed by:** Ewelina Gajewska
- **Model type:** RoBERTa for stance classification
- **Language(s) (NLP):** English social media data from Twitter and Reddit
- **Finetuned from model:** [j-hartmann/sentiment-roberta-large-english-3-classes](https://huggingface.co/j-hartmann/sentiment-roberta-large-english-3-classes)

## Uses

```python
from transformers import pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_path = "eevvgg/StanBERT"
cls_task = pipeline(task="text-classification", model=model_path, tokenizer=model_path)  # pass device=0 to run on a GPU

sequence = ['his rambling has no clear ideas behind it',
            'That has nothing to do with medical care',
            "Turns around and shows how qualified she is because of her political career.",
            'She has very little to gain by speaking too much']

result = cls_task(sequence)
```
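
The pipeline returns one dict per input with a predicted `label` and a confidence `score`, e.g. `[{'label': 'negative', 'score': 0.98}, ...]`; the exact label strings come from the model's config, and the score shown here is only illustrative.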

Stance classification in English social media data. Fine-tuned on a balanced corpus of 8.4k texts, partially semi-annotated.
The model is suited for classification of stance in short texts.


## Model Sources

- **Repository:** training procedure available in a [Colab notebook](https://colab.research.google.com/drive/1-C47Ei7vgYtcfLLBB_Vkm3nblE5zH-aL?usp=sharing)
- **Paper:** TBA

## Training Details

### Preprocessing

User mentions and hyperlinks were normalized to "user" and "url" tokens, respectively, as sketched below.

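A minimal sketch of this normalization step; the exact regular expressions used for training are not published, so the patterns below are assumptions:

```python
import re

def normalize(text: str) -> str:
    """Replace hyperlinks and user mentions with placeholder tokens (assumed patterns)."""
    text = re.sub(r"https?://\S+|www\.\S+", "url", text)  # hyperlinks -> "url"
    text = re.sub(r"@\w+", "user", text)                  # @mentions -> "user"
    return text

print(normalize("@greta see https://example.com"))  # -> "user see url"
```
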
### Training Hyperparameters

- trained for 2 epochs with a mini-batch size of 8
- loss: 0.574
- learning_rate: 4e-5; weight_decay: 1e-2

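These settings map onto the `transformers` `Trainer` API roughly as follows. This is a hedged sketch, not the published training notebook (linked above); `train_ds` and `eval_ds` stand in for the unpublished tokenized train/eval splits.

```python
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Start from the sentiment model named in the card
base = "j-hartmann/sentiment-roberta-large-english-3-classes"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=3)

args = TrainingArguments(
    output_dir="stanbert",
    num_train_epochs=2,              # trained for 2 epochs
    per_device_train_batch_size=8,   # mini-batch size of 8
    learning_rate=4e-5,
    weight_decay=1e-2,
)

# train_ds and eval_ds are placeholders for the actual tokenized datasets
trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds)
trainer.train()
```
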
## Evaluation

### Results

Evaluation on a held-out 15% of the data:

- accuracy: 91.2
- macro avg:
  - f1: 91.4
  - precision: 91.4
  - recall: 91.5
- weighted avg:
  - f1: 91.2
  - precision: 91.3
  - recall: 91.2

Per-class results:

| class    | precision | recall | f1-score | support |
|----------|-----------|--------|----------|---------|
| neutral  | 0.930     | 0.868  | 0.898    | 471     |
| positive | 0.933     | 0.946  | 0.940    | 355     |
| negative | 0.878     | 0.931  | 0.904    | 433     |

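The per-class breakdown has the shape of scikit-learn's `classification_report`; a minimal, illustrative sketch of producing such a report (the actual evaluation split is not published, so the labels below are made up):

```python
from sklearn.metrics import classification_report

# Illustrative gold and predicted labels only, not the model's real outputs
y_true = ["neutral", "positive", "negative", "negative", "positive"]
y_pred = ["neutral", "positive", "negative", "neutral", "positive"]

print(classification_report(y_true, y_pred, digits=3))
```
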
## Citation

**BibTeX:** TBA