StanceBERTa / README.md
---
tags:
- text
- stance
language:
- en
metrics:
- f1
- accuracy
pipeline_tag: text-classification
widget:
- text: user Bolsonaro is the president of Brazil. He speaks for all brazilians. Greta is a climate activist. Their opinions do create a balance that the world needs now
example_title: example 1
- text: user The fact is that she still doesn’t change her ways and still stays non environmental friendly
example_title: example 2
- text: user The criteria for these awards dont seem to be very high.
example_title: example 3
model-index:
- name: StanceBERTa
results:
- task:
type: text-classification
name: Text Classification
dataset:
type: social media
name: unpublished
metrics:
- type: f1
value: 77.8
- type: accuracy
value: 78.5
---
# eevvgg/StanceBERTa
<!-- Provide a quick summary of what the model is/does. -->
This model is a fine-tuned version of the **distilroberta-base** model that predicts three categories of stance (negative, positive, neutral) towards an entity mentioned in the text.
It was fine-tuned on a larger and more balanced data sample than the previous version, [eevvgg/Stance-Tw](https://huggingface.co/eevvgg/Stance-Tw).
- **Developed by:** Ewelina Gajewska
- **Model type:** RoBERTa for stance classification
- **Language(s) (NLP):** English social media data from Twitter and Reddit
- **Finetuned from model:** [distilroberta-base](https://huggingface.co/distilroberta-base)
## Uses
```python
from transformers import pipeline

model_path = "eevvgg/StanceBERTa"
cls_task = pipeline(task="text-classification", model=model_path, tokenizer=model_path)  # add device=0 to run on GPU

sequence = ["user The fact is that she still doesn’t change her ways and still stays non environmental friendly",
            "user The criteria for these awards dont seem to be very high."]
result = cls_task(sequence)
```
The model is suited for stance classification in short texts. It was fine-tuned on a balanced, partially semi-annotated corpus of 5.6k examples.
It may also serve as a starting point for fine-tuning on hate/offensive language detection.
## Model Sources
<!-- Provide the basic links for the model. -->
- **Repository:** training procedure available in [Colab notebook](https://colab.research.google.com/drive/1-C47Ei7vgYtcfLLBB_Vkm3nblE5zH-aL?usp=sharing)
- **Paper:** tba
## Training Details
### Preprocessing
Normalization of user mentions and hyperlinks to "@user" and "http" tokens, respectively.
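The exact normalization patterns used in training are not published; a minimal sketch of this step, with assumed regexes, could look like:

```python
import re

def normalize(text: str) -> str:
    """Replace user mentions and hyperlinks with placeholder tokens."""
    # Mentions like "@someone" -> "@user" (assumed pattern)
    text = re.sub(r"@\w+", "@user", text)
    # Hyperlinks -> "http" (assumed pattern)
    text = re.sub(r"https?://\S+", "http", text)
    return text

print(normalize("@greta check https://example.com/report"))  # @user check http
```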
### Training Hyperparameters
- trained for 3 epochs with a mini-batch size of 8
- loss: 0.509
- learning_rate: 5e-5; weight_decay: 1e-2
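For reference, these hyperparameters map onto the `transformers` Trainer API roughly as follows; this is a sketch, not the exact training configuration, and `output_dir` is illustrative:

```python
from transformers import TrainingArguments

# Hyperparameters from the card; all other arguments keep library defaults.
args = TrainingArguments(
    output_dir="stance-roberta",        # illustrative path
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=5e-5,
    weight_decay=1e-2,
)
```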
## Evaluation
### Results
- evaluated on a held-out 15% of the data
- accuracy: 0.785
- macro avg:
- f1: 0.778
- precision: 0.779
- recall: 0.778
- weighted avg:
- f1: 0.786
- precision: 0.786
- recall: 0.785
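The macro and weighted averages above differ only in how per-class scores are combined: macro averaging treats all classes equally, while weighted averaging scales each class by its support. A small sketch with hypothetical per-class F1 values (not the model's actual per-class scores):

```python
def macro_avg(scores):
    """Unweighted mean over classes."""
    return sum(scores) / len(scores)

def weighted_avg(scores, supports):
    """Mean over classes weighted by class support (sample counts)."""
    return sum(s * n for s, n in zip(scores, supports)) / sum(supports)

# Hypothetical per-class F1 for negative / positive / neutral
f1 = [0.70, 0.80, 0.84]
support = [100, 120, 200]

print(round(macro_avg(f1), 3))             # 0.78
print(round(weighted_avg(f1, support), 3)) # 0.795
```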
## Citation
**BibTeX:** tba