bert-base-detect-jailbreak
This model is a fine-tuned version of bert-base-uncased on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.3486
- Accuracy: 0.8931
- Precision: 0.9206
- Recall: 0.8657
- F1: 0.8923
- Balanced Accuracy: 0.8938
- MCC (Matthews correlation coefficient): 0.7879
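All of the metrics above follow from the four cells of a binary confusion matrix. The card does not report those counts, so the sketch below uses hypothetical counts (TP=58, FP=5, FN=9, TN=59), chosen only because they reproduce the reported figures to four decimal places. They are an assumption, not data taken from the evaluation run, and the function name `binary_metrics` is illustrative.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)     # true negative rate
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    balanced_accuracy = (recall + specificity) / 2
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


# Hypothetical counts (not from the card); picked to match the reported metrics.
metrics = binary_metrics(tp=58, fp=5, fn=9, tn=59)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Precision above recall means that, under these assumed counts, the model misses more positive examples than it raises false alarms.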
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 10
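To make the `linear` scheduler concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes no warmup steps (the card lists none) and takes the total of 990 optimizer steps from the final row of the training results table. The function `linear_lr` is illustrative and is not the library's implementation.

```python
PEAK_LR = 2e-05      # learning_rate from the hyperparameter list
TOTAL_STEPS = 990    # last "Step" value in the training results table
WARMUP_STEPS = 0     # assumption: the card does not mention warmup


def linear_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, at the midpoint, and at the end of training.
for step in (0, 495, 990):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 2e-05, is halved by the end of epoch 5, and reaches zero at step 990.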
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 | Balanced Accuracy | MCC |
|---|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 99 | 0.2730 | 0.9059 | 0.9305 | 0.8788 | 0.9039 | 0.9061 | 0.8130 |
| 0.4532 | 2.0 | 198 | 0.2610 | 0.9059 | 0.9548 | 0.8535 | 0.9013 | 0.9063 | 0.8165 |
| 0.2683 | 3.0 | 297 | 0.2622 | 0.9008 | 0.9441 | 0.8535 | 0.8966 | 0.9011 | 0.8054 |
| 0.202 | 4.0 | 396 | 0.2914 | 0.9109 | 0.9179 | 0.9040 | 0.9109 | 0.9110 | 0.8220 |
| 0.1308 | 5.0 | 495 | 0.3012 | 0.9135 | 0.9362 | 0.8889 | 0.9119 | 0.9137 | 0.8281 |
| 0.0856 | 6.0 | 594 | 0.3709 | 0.8906 | 0.8818 | 0.9040 | 0.8928 | 0.8905 | 0.7814 |
| 0.0622 | 7.0 | 693 | 0.4141 | 0.8957 | 0.8905 | 0.9040 | 0.8972 | 0.8956 | 0.7914 |
| 0.0366 | 8.0 | 792 | 0.4711 | 0.8957 | 0.8720 | 0.9293 | 0.8998 | 0.8954 | 0.7930 |
| 0.0262 | 9.0 | 891 | 0.4318 | 0.8982 | 0.8990 | 0.8990 | 0.8990 | 0.8982 | 0.7964 |
| 0.0145 | 10.0 | 990 | 0.4440 | 0.8957 | 0.8867 | 0.9091 | 0.8978 | 0.8956 | 0.7916 |
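The table shows training loss falling steadily to 0.0145 while validation loss bottoms out early and then climbs, a pattern consistent with overfitting in later epochs. The card does not say which checkpoint was kept, so the sketch below only illustrates how one might pick an epoch from these rows; the tuple layout and variable names are illustrative.

```python
# (epoch, training_loss, validation_loss, f1) copied from the table above.
# None marks the first epoch, where the training loss was not logged.
rows = [
    (1, None, 0.2730, 0.9039),
    (2, 0.4532, 0.2610, 0.9013),
    (3, 0.2683, 0.2622, 0.8966),
    (4, 0.202, 0.2914, 0.9109),
    (5, 0.1308, 0.3012, 0.9119),
    (6, 0.0856, 0.3709, 0.8928),
    (7, 0.0622, 0.4141, 0.8972),
    (8, 0.0366, 0.4711, 0.8998),
    (9, 0.0262, 0.4318, 0.8990),
    (10, 0.0145, 0.4440, 0.8978),
]

# Two common selection criteria: lowest validation loss, or highest F1.
best_by_loss = min(rows, key=lambda r: r[2])
best_by_f1 = max(rows, key=lambda r: r[3])

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[2]:.4f})")
print(f"highest F1: epoch {best_by_f1[0]} ({best_by_f1[3]:.4f})")
```

The two criteria disagree here: validation loss is lowest at epoch 2 (0.2610), while F1 peaks at epoch 5 (0.9119), so the choice depends on which metric matters more for the deployment.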
Framework versions
- Transformers 4.53.3
- Pytorch 2.6.0+cu124
- Datasets 4.3.0
- Tokenizers 0.21.4
Model tree for hurtmongoose/bert-base-detect-jailbreak
- Base model: google-bert/bert-base-uncased