metadata
license: cc-by-nc-sa-4.0
datasets:
  - franciellevargas/HateBR
language:
  - pt
metrics:
  - accuracy
  - f1
  - recall
  - roc_auc
base_model:
  - neuralmind/bert-base-portuguese-cased
tags:
  - hate_speech
  - algorithmic_bias

bertimbau-hate-detector

This model is a fine-tuned version of neuralmind/bert-base-portuguese-cased.

Model description

This model detects hate speech in Brazilian Portuguese. It achieved the following metrics on the validation set:

  • accuracy 90%
  • f1 91%
  • auc 96%
  • recall 92%
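
As a reminder of how these metrics relate to a classifier's predictions, here is a minimal sketch that computes accuracy, recall, and F1 for a binary task from scratch. The labels are made-up toy values, not HateBR data, and the function name `binary_metrics` is hypothetical; this card does not say which library produced the figures above.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, recall, and F1 for binary labels (toy illustration)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


# Toy labels (1 = hate speech, 0 = not); these are NOT real results.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Recall is worth watching alongside accuracy for this task, since it measures how many actual hate-speech examples the model catches.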

Intended uses & limitations

The experiments conducted to assess racial bias show that this model can perpetuate racial bias based on the Brazilian Portuguese dialect called "pretuguês".

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-4
  • train_batch_size: 16
  • eval_batch_size: 16
  • optimizer: Adam
  • num_epochs: 20
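
For anyone reproducing the run, the values above could be collected in a small configuration object. This is only a sketch: the `TrainConfig` class and its `total_steps` helper are hypothetical names, the dataset size used below is a placeholder, and the helper assumes no gradient accumulation, which the card does not specify.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values copied from the hyperparameter list above.
    learning_rate: float = 2e-4
    train_batch_size: int = 16
    eval_batch_size: int = 16
    optimizer: str = "Adam"
    num_epochs: int = 20

    def total_steps(self, num_train_examples: int) -> int:
        """Optimizer steps over the whole run, assuming no gradient accumulation."""
        steps_per_epoch = math.ceil(num_train_examples / self.train_batch_size)
        return steps_per_epoch * self.num_epochs


config = TrainConfig()
# 1,000 is a placeholder size, not the actual HateBR training split.
example_steps = config.total_steps(1_000)
```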

Usage

from transformers import AutoTokenizer  # or BertTokenizer
from transformers import AutoModelForSequenceClassification  # or BertForSequenceClassification

# Assumes the checkpoint ships a classification head, as expected for a fine-tuned detector.
model = AutoModelForSequenceClassification.from_pretrained('cassiasilvaR/bertimbau-hate-detector')
# The base model is cased, so the tokenizer keeps the original casing.
tokenizer = AutoTokenizer.from_pretrained('cassiasilvaR/bertimbau-hate-detector', do_lower_case=False)
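
Once the model returns class logits, they still have to be turned into a label and a probability. The sketch below shows that post-processing step in plain Python. It assumes the checkpoint is loaded with a sequence-classification head (for example via `AutoModelForSequenceClassification`) and that index 1 means hate speech; both are assumptions, and the `ID2LABEL` mapping and `decode` helper are hypothetical. The real label order should be checked in `model.config.id2label`.

```python
import math

# Hypothetical label order; check model.config.id2label on the real checkpoint.
ID2LABEL = {0: "not_hate", 1: "hate"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, id2label=ID2LABEL):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# With the real model, the logits would come from something like:
#   inputs = tokenizer(text, return_tensors="pt", truncation=True)
#   logits = model(**inputs).logits[0].tolist()
# Made-up logits stand in for that call here.
label, prob = decode([-1.2, 2.3])
```

Given the bias noted under "Intended uses & limitations", treating the returned probability as a score for human review, rather than as a final decision, may be the safer choice.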