---
license: cc-by-nc-sa-4.0
datasets:
- franciellevargas/HateBR
language:
- pt
metrics:
- accuracy
- f1
- recall
- roc_auc
base_model:
- neuralmind/bert-base-portuguese-cased
tags:
- hate_speech
- algorithmic_bias
---

# bertimbau-hate-detector

This model is a fine-tuned version of [neuralmind/bert-base-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased).

## Model description

This model detects hate speech in Brazilian Portuguese. It achieved the following metrics on the validation set:

- accuracy: 90%
- f1: 91%
- auc: 96%
- recall: 92%

## Intended uses & limitations

The experiments conducted to assess racial bias show that this model can perpetuate racial bias tied to the Brazilian Portuguese dialect known as "pretuguês".

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-4
- train_batch_size: 16
- eval_batch_size: 16
- optimizer: Adam
- num_epochs: 20

## Usage

```python
from transformers import AutoTokenizer  # or BertTokenizer
from transformers import AutoModelForPreTraining  # or BertForPreTraining, for loading pretraining heads
from transformers import AutoModel  # or BertModel, for BERT without pretraining heads

model = AutoModelForPreTraining.from_pretrained('cassiasilvaR/bertimbau-hate-detector')
tokenizer = AutoTokenizer.from_pretrained('cassiasilvaR/bertimbau-hate-detector', do_lower_case=False)
```
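
The snippet above only loads the weights. To get a prediction out of a hate speech detector, the checkpoint would normally be loaded with a classification head. The following is a minimal sketch under assumptions this card does not confirm: that the checkpoint exposes a sequence-classification head loadable with `AutoModelForSequenceClassification`, and that the label names stored in the model config are meaningful. The helper names `softmax`, `decode`, and `classify` are hypothetical and introduced only for illustration.

```python
import math


def softmax(logits):
    """Convert a list of raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logit_rows, id2label):
    """Pick the highest-scoring class for each row of logits."""
    results = []
    for row in logit_rows:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append({
            "label_id": best,
            # Fall back to the numeric id if the config has no name for it.
            "label": id2label.get(best, str(best)),
            "score": probs[best],
        })
    return results


def classify(texts, model_id="cassiasilvaR/bertimbau-hate-detector"):
    """Run the model on a list of strings (hypothetical helper).

    ASSUMPTION: the checkpoint carries a sequence-classification head.
    """
    # Imported here so the helpers above work without torch installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, do_lower_case=False)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    return decode(logits.tolist(), model.config.id2label)


# Example call (downloads the checkpoint on first use):
# for result in classify(["Que dia lindo!"]):
#     print(result)
```

Because the card does not document which class index corresponds to hate speech, check `model.config.id2label` before interpreting `label_id`. Given the racial bias noted under "Intended uses & limitations", predictions on text written in "pretuguês" deserve particular scrutiny.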