Upload benchmark_results.md with huggingface_hub
benchmark_results.md CHANGED (+6 -6)
@@ -8,7 +8,7 @@ The best performing model was chosen based on the highest average F1 score acros
 ```
 learning_rate: 5e-05
 per_device_train_batch_size: 32
-num_train_epochs:
+num_train_epochs: 10
 weight_decay: 0.05
 lr_scheduler_type: cosine
 ```
@@ -17,8 +17,8 @@ lr_scheduler_type: cosine
 
 | dataset | accuracy | f1_score | recall | precision |
 |:----------------------------------|-----------:|-----------:|---------:|------------:|
-| BRlkl/BingoGuard-train-test-pt | 0.
-| BRlkl/openai-moderation-eval-pt | 0.
-| BRlkl/WildGuardTest-pt | 0.
-| BRlkl/XSTest-pt | 0.
-| BRlkl/toxic-chat-pt (40% holdout) | 0.
+| BRlkl/BingoGuard-train-test-pt | 0.868421 | 0.929577 | 0.868421 | 1 |
+| BRlkl/openai-moderation-eval-pt | 0.71131 | 0.655784 | 0.885057 | 0.520857 |
+| BRlkl/WildGuardTest-pt | 0.837552 | 0.803698 | 0.749337 | 0.866564 |
+| BRlkl/XSTest-pt | 0.877778 | 0.864865 | 0.88 | 0.850242 |
+| BRlkl/toxic-chat-pt (40% holdout) | 0.974435 | 0.843373 | 0.921053 | 0.777778 |
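The hunk header says the best model was chosen by highest average F1 score. As a rough illustration of that criterion, the sketch below recomputes each row's F1 from its precision and recall, then averages the reported `f1_score` values. It assumes "average" means an unweighted mean over the five datasets in the updated table, which the truncated sentence does not confirm. The names `results`, `f1_from`, and `average_f1` are hypothetical helpers, not part of the repository.

```python
# Values copied from the updated (+) rows of benchmark_results.md.
results = {
    "BRlkl/BingoGuard-train-test-pt": {"f1_score": 0.929577, "recall": 0.868421, "precision": 1.0},
    "BRlkl/openai-moderation-eval-pt": {"f1_score": 0.655784, "recall": 0.885057, "precision": 0.520857},
    "BRlkl/WildGuardTest-pt": {"f1_score": 0.803698, "recall": 0.749337, "precision": 0.866564},
    "BRlkl/XSTest-pt": {"f1_score": 0.864865, "recall": 0.88, "precision": 0.850242},
    "BRlkl/toxic-chat-pt (40% holdout)": {"f1_score": 0.843373, "recall": 0.921053, "precision": 0.777778},
}


def f1_from(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def average_f1(rows: dict) -> float:
    """Unweighted mean of f1_score over datasets (an assumption, see above)."""
    return sum(row["f1_score"] for row in rows.values()) / len(rows)


# Sanity check: each reported F1 should match its precision/recall pair.
for name, row in results.items():
    recomputed = f1_from(row["precision"], row["recall"])
    assert abs(recomputed - row["f1_score"]) < 1e-4, name

print(f"average F1 across datasets: {average_f1(results):.6f}")
```

An unweighted mean gives every dataset equal influence regardless of size. If the selection actually weighted datasets by sample count, the ranking between candidate models could differ.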