Hyperparameter Search Results for BRlkl/BingoGuard-bert-base-pt-optimized

This model was selected by randomized search over a hyperparameter space. The chosen configuration is the one that achieved the highest average F1 score across multiple benchmark datasets.
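The selection procedure described above can be sketched roughly as follows. This is a minimal illustration and not the script that produced this model: the `SEARCH_SPACE` values, the `evaluate` callback, and the function names are all hypothetical, since the sampled ranges and the number of trials are not stated here.

```python
import random
from statistics import mean

# Hypothetical search space: the ranges actually sampled are not given in this document.
SEARCH_SPACE = {
    "learning_rate": [1e-5, 2e-5, 3e-5, 5e-5],
    "per_device_train_batch_size": [16, 32],
    "num_train_epochs": [4, 6, 8],
    "weight_decay": [0.0, 0.01, 0.05],
    "lr_scheduler_type": ["linear", "cosine"],
}


def sample_config(rng):
    """Draw one configuration uniformly at random from SEARCH_SPACE."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def random_search(evaluate, n_trials, seed=0):
    """Return the sampled config with the highest mean F1, and that mean.

    `evaluate` is a caller-supplied (hypothetical) function that trains a
    model with the given config and returns {dataset_name: f1_score}.
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        f1_by_dataset = evaluate(config)
        score = mean(f1_by_dataset.values())  # unweighted average across datasets
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

In practice `evaluate` would wrap a full fine-tuning run followed by evaluation on each benchmark dataset, so the number of trials is bounded by the available compute.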

Best Hyperparameters

learning_rate: 5e-05
per_device_train_batch_size: 32
num_train_epochs: 8
weight_decay: 0.05
lr_scheduler_type: cosine
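The parameter names above appear to match keyword arguments of `transformers.TrainingArguments`, although the training framework is not named in this document. As a rough illustration of what `lr_scheduler_type: cosine` implies for the learning rate, the sketch below collects the values in a dictionary and computes a half-cosine decay. The `cosine_lr` helper, the zero-warmup default, and the step counts are assumptions for illustration only.

```python
import math

# Values copied from the list above.
BEST_HYPERPARAMETERS = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 32,
    "num_train_epochs": 8,
    "weight_decay": 0.05,
    "lr_scheduler_type": "cosine",
}


def cosine_lr(step, total_steps, base_lr=BEST_HYPERPARAMETERS["learning_rate"], warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by half-cosine decay.

    Assumption: warmup_steps defaults to 0 because no warmup setting is reported.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))
```

Under these assumptions the rate starts at `5e-05`, falls to half of that midway through training, and reaches zero at the final step. The total number of steps depends on the training set size, which is not given here.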

Final Benchmark Results (Full Dataset Training)

| dataset | accuracy | f1_score | recall | precision |
|---|---|---|---|---|
| BRlkl/BingoGuard-train-test-pt | 0.897773 | 0.946133 | 0.897773 | 1 |
| BRlkl/openai-moderation-eval-pt | 0.69881 | 0.651994 | 0.908046 | 0.508584 |
| BRlkl/WildGuardTest-pt | 0.839317 | 0.810021 | 0.771883 | 0.852123 |
| BRlkl/XSTest-pt | 0.831111 | 0.824885 | 0.895 | 0.764957 |
| BRlkl/toxic-chat-pt (40% holdout) | 0.97296 | 0.836795 | 0.927632 | 0.762162 |
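As a sanity check on the table above, the sketch below recomputes F1 as the harmonic mean of the reported precision and recall and takes the unweighted mean of the reported F1 scores. The helper names are illustrative. Whether the model-selection average used exactly these five datasets, or this unweighted form, is an assumption.

```python
from statistics import mean

# (dataset, accuracy, f1_score, recall, precision), copied from the table above.
RESULTS = [
    ("BRlkl/BingoGuard-train-test-pt", 0.897773, 0.946133, 0.897773, 1.0),
    ("BRlkl/openai-moderation-eval-pt", 0.69881, 0.651994, 0.908046, 0.508584),
    ("BRlkl/WildGuardTest-pt", 0.839317, 0.810021, 0.771883, 0.852123),
    ("BRlkl/XSTest-pt", 0.831111, 0.824885, 0.895, 0.764957),
    ("BRlkl/toxic-chat-pt (40% holdout)", 0.97296, 0.836795, 0.927632, 0.762162),
]


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def inconsistent_rows(rows, tol=1e-5):
    """Names of datasets whose reported F1 differs from the recomputed value."""
    return [
        name
        for name, _acc, f1, rec, prec in rows
        if abs(f1_from(prec, rec) - f1) > tol
    ]


mean_f1 = mean(f1 for _name, _acc, f1, _rec, _prec in RESULTS)
print(f"unweighted mean F1: {mean_f1:.4f}")
print("rows with inconsistent F1:", inconsistent_rows(RESULTS))
```

All five reported F1 values agree with the recomputed ones to within rounding, which is consistent with a binary F1 computed on the positive class.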