---
license: mit
---
# Model Card for BERT hate offensive tweets
BERT base uncased fine-tuned on the data found here: [MartynaKopyta/hate_offensive_tweets](https://huggingface.co/datasets/MartynaKopyta/hate_offensive_tweets) to classify tweets as 0 (hate), 1 (offensive), or 2 (neither).
You can find the notebook used for training in my GitHub repo: [MartynaKopyta/BERT_FINE-TUNING](https://github.com/MartynaKopyta/BERT_FINE-TUNING/blob/main/BERT_hate_offensive_speech.ipynb).
## Model Details
- **Fine-tuned from model [bert-base-uncased](https://huggingface.co/bert-base-uncased)**
## Bias, Risks, and Limitations
The dataset was not large enough for BERT to learn to classify the three classes with high accuracy; the model is correct roughly three out of four times (about 78% accuracy on the test set).
## How to Get Started with the Model
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained('MartynaKopyta/BERT_hate_offensive_tweets')
tokenizer = AutoTokenizer.from_pretrained('MartynaKopyta/BERT_hate_offensive_tweets')
```
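Once the model and tokenizer are loaded, classifying a tweet is a standard sequence-classification forward pass. A minimal sketch (the example text and the label names, taken from the class mapping above, are illustrative):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained('MartynaKopyta/BERT_hate_offensive_tweets')
tokenizer = AutoTokenizer.from_pretrained('MartynaKopyta/BERT_hate_offensive_tweets')

# Label mapping as described in this card: 0 - hate, 1 - offensive, 2 - neither
labels = {0: 'hate', 1: 'offensive', 2: 'neither'}

text = "What a lovely day"  # hypothetical example tweet
inputs = tokenizer(text, return_tensors='pt', truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
pred = logits.argmax(dim=-1).item()
print(labels[pred])
```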
## Training Hyperparameters
- **batch size:** 16
- **learning rate:** 2e-5
- **epochs:** 3
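With the `transformers` `Trainer` API, these hyperparameters would be set roughly as follows. This is a hedged sketch, not the exact training code (the full notebook is linked above); the `output_dir` and the elided datasets are placeholders:

```python
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=3)
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

args = TrainingArguments(
    output_dir='bert_hate_offensive',   # placeholder path
    per_device_train_batch_size=16,     # batch size from this card
    learning_rate=2e-5,                 # learning rate from this card
    num_train_epochs=3,                 # epochs from this card
)

# Training would then attach the tokenized train/eval splits:
# trainer = Trainer(model=model, args=args, train_dataset=..., eval_dataset=...)
# trainer.train()
```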
## Evaluation
```
Accuracy: 0.779373368146214

Classification Report:
              precision    recall  f1-score   support

           0       0.74      0.68      0.71      1532
           1       0.85      0.88      0.87      1532
           2       0.74      0.78      0.76      1532

    accuracy                           0.78      4596
   macro avg       0.78      0.78      0.78      4596
weighted avg       0.78      0.78      0.78      4596

Confusion Matrix:
[[1043   96  393]
 [ 169 1343   20]
 [ 204  132 1196]]

MCC: 0.670
```
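Metrics of this form are what `scikit-learn` produces; a minimal sketch of how they might be computed from model predictions (the tiny `y_true`/`y_pred` lists here are placeholders, in practice they would come from running the model on the test split):

```python
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, matthews_corrcoef)

# Placeholder labels for illustration; real values come from the test set.
y_true = [0, 1, 2, 1, 0, 2]
y_pred = [0, 1, 2, 2, 0, 2]

print('Accuracy:', accuracy_score(y_true, y_pred))
print('Classification Report:')
print(classification_report(y_true, y_pred))
print('Confusion Matrix:')
print(confusion_matrix(y_true, y_pred))
print('MCC:', round(matthews_corrcoef(y_true, y_pred), 3))
```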