license: cc-by-nc-sa-4.0
datasets:
- franciellevargas/HateBR
language:
- pt
metrics:
- accuracy
- f1
- recall
- roc_auc
base_model:
- neuralmind/bert-base-portuguese-cased
tags:
- hate_speech
- algorithmic_bias
bertimbau-hate-detector
This model is a fine-tuned version of neuralmind/bert-base-portuguese-cased.
Model description
This model detects hate speech in Brazilian Portuguese. It achieved the following metrics on the validation set:
- accuracy 90%
- f1 91%
- auc 96%
- recall 92%
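For readers who want to compute the same kinds of metrics on their own predictions, here is a minimal pure-Python sketch. It assumes binary labels where 1 marks hate speech, and it computes F1 and recall for the positive class; the card does not say which averaging was used for the reported numbers, so that choice is an assumption. The toy labels and scores are illustrative only and are not taken from HateBR.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, recall and F1 for binary labels (assumed: 1 = hate speech)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


def roc_auc(y_true, y_score):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one example of each class")
    # A tie between a positive and a negative score counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.1, 0.3, 0.6]

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, y_score)
```

Accuracy, recall and F1 are computed from hard predictions, while AUC needs the predicted score for the positive class.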
Intended uses & limitations
Experiments conducted to assess racial bias show that this model can perpetuate racial bias associated with the Brazilian Portuguese dialect known as "pretuguês".
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-4
- train_batch_size: 16
- eval_batch_size: 16
- optimizer: Adam
- num_epochs: 20
Usage
from transformers import AutoTokenizer  # or BertTokenizer
from transformers import AutoModelForSequenceClassification  # or BertForSequenceClassification

# Assumes the checkpoint ships a sequence-classification head, as expected for a hate speech detector
model = AutoModelForSequenceClassification.from_pretrained('cassiasilvaR/bertimbau-hate-detector')
tokenizer = AutoTokenizer.from_pretrained('cassiasilvaR/bertimbau-hate-detector', do_lower_case=False)
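Once the model and tokenizer are loaded, a prediction for a single text could be obtained roughly as follows. This is a sketch, not the author's confirmed procedure: it assumes the checkpoint exposes a sequence-classification head loadable with `AutoModelForSequenceClassification`, and it reads label names from `model.config.id2label` because the card does not document the label mapping. The helper names `softmax` and `classify` are introduced here for illustration.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(text, model_id="cassiasilvaR/bertimbau-hate-detector"):
    """Return (label, probability) for one text.

    Assumption: the checkpoint has a sequence-classification head.
    """
    # Imported lazily so softmax() can be used without torch/transformers.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, do_lower_case=False)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    # The label mapping is read from the config, not assumed here.
    label = model.config.id2label.get(best, str(best))
    return label, probs[best]
```

Calling `classify("some text in Portuguese")` downloads the checkpoint on first use and returns the top label with its probability. Given the racial bias noted under the limitations, predictions on text written in "pretuguês" warrant particular caution.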