Hyperparameter Search Results for BRlkl/BingoGuard-bert-base-pt-optimized
This model was selected by random search over a hyperparameter space. The best-performing configuration was the one with the highest average F1 score across multiple benchmark datasets.
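The selection procedure described above can be sketched roughly as follows. This is a minimal illustration, not the code used for this model: the search space values, the number of trials, and the `train_and_eval` callback (assumed to train one configuration and return a mapping from dataset name to F1) are all hypothetical, since the card does not list them.

```python
import random
from statistics import mean

# Hypothetical search space: the model card does not state the ranges searched.
SEARCH_SPACE = {
    "learning_rate": [1e-5, 2e-5, 3e-5, 5e-5],
    "per_device_train_batch_size": [16, 32, 64],
    "num_train_epochs": [3, 5, 8],
    "weight_decay": [0.0, 0.01, 0.05],
    "lr_scheduler_type": ["linear", "cosine"],
}


def sample_config(rng):
    """Draw one configuration uniformly at random from the search space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def random_search(train_and_eval, n_trials, seed=0):
    """Return the sampled config with the highest mean F1 across datasets.

    train_and_eval is a caller-supplied (hypothetical) function that trains
    a model with the given config and returns {dataset_name: f1_score}.
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        f1_by_dataset = train_and_eval(config)
        score = mean(f1_by_dataset.values())  # unweighted average over datasets
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

The unweighted mean treats every benchmark equally regardless of its size; whether the original search weighted datasets differently is not stated.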
Best Hyperparameters
learning_rate: 5e-05
per_device_train_batch_size: 32
num_train_epochs: 8
weight_decay: 0.05
lr_scheduler_type: cosine
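The hyperparameter names above match keyword arguments of `transformers.TrainingArguments`, so one plausible way to reuse them is sketched below. The card does not say how training was launched, so the use of `TrainingArguments` and the `output_dir` value are assumptions; any setting not listed above would fall back to the library defaults.

```python
# Best hyperparameters as reported in this card.
BEST_HPARAMS = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 32,
    "num_train_epochs": 8,
    "weight_decay": 0.05,
    "lr_scheduler_type": "cosine",
}


def build_training_args(output_dir="bingoguard-bert-run"):
    """Build TrainingArguments from the reported values.

    output_dir is a hypothetical placeholder. The import is done lazily so
    the dictionary above can be used without transformers installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **BEST_HPARAMS)
```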
Final Benchmark Results (Full Dataset Training)
| dataset | accuracy | f1_score | recall | precision |
|---|---|---|---|---|
| BRlkl/BingoGuard-train-test-pt | 0.897773 | 0.946133 | 0.897773 | 1.000000 |
| BRlkl/openai-moderation-eval-pt | 0.698810 | 0.651994 | 0.908046 | 0.508584 |
| BRlkl/WildGuardTest-pt | 0.839317 | 0.810021 | 0.771883 | 0.852123 |
| BRlkl/XSTest-pt | 0.831111 | 0.824885 | 0.895000 | 0.764957 |
| BRlkl/toxic-chat-pt (40% holdout) | 0.972960 | 0.836795 | 0.927632 | 0.762162 |
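As a sanity check on the table, the reported F1 values can be recomputed as the harmonic mean of precision and recall, and then averaged across datasets. The sketch below copies the numbers from the table; the unweighted mean is an assumption about how "average F1" is defined, and the value computed here comes from the final full-dataset runs, which may differ from the score seen during the search.

```python
from statistics import mean

# Values copied from the benchmark table above.
RESULTS = {
    "BRlkl/BingoGuard-train-test-pt": {"f1": 0.946133, "recall": 0.897773, "precision": 1.0},
    "BRlkl/openai-moderation-eval-pt": {"f1": 0.651994, "recall": 0.908046, "precision": 0.508584},
    "BRlkl/WildGuardTest-pt": {"f1": 0.810021, "recall": 0.771883, "precision": 0.852123},
    "BRlkl/XSTest-pt": {"f1": 0.824885, "recall": 0.895, "precision": 0.764957},
    "BRlkl/toxic-chat-pt (40% holdout)": {"f1": 0.836795, "recall": 0.927632, "precision": 0.762162},
}


def f1_from(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Recompute F1 per dataset to compare against the reported column.
recomputed_f1 = {
    name: f1_from(row["precision"], row["recall"]) for name, row in RESULTS.items()
}

# Unweighted mean of the reported F1 scores.
mean_f1 = mean(row["f1"] for row in RESULTS.values())

for name, row in RESULTS.items():
    print(f"{name}: reported={row['f1']:.6f} recomputed={recomputed_f1[name]:.6f}")
print(f"mean F1: {mean_f1:.4f}")
```

The recomputed values agree with the reported column to within rounding, which suggests the precision, recall, and F1 columns all refer to the same positive class.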