Training parameters:

```
from simpletransformers.classification import ClassificationArgs

model_args = ClassificationArgs()
model_args.max_seq_length = 512
model_args.train_batch_size = 12
model_args.eval_batch_size = 12
model_args.num_train_epochs = 5
model_args.evaluate_during_training = False
model_args.learning_rate = 1e-5
model_args.use_multiprocessing = False
model_args.fp16 = False
model_args.save_steps = -1
model_args.save_eval_checkpoints = False
model_args.no_cache = True
model_args.reprocess_input_data = True
model_args.overwrite_output_dir = True
```
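
The card does not show how these arguments are passed to a model, so the following is only a minimal sketch of one plausible way to do it with Simple Transformers. The helper names `to_sentence_pair_rows` and `train_boolq_classifier` are hypothetical, and the `"roberta"` / `"roberta-base"` defaults are placeholders, not the checkpoint behind this card. It also assumes BoolQ records carry `question`, `passage`, and `answer` fields, and that `True` maps to label 1.

```python
def to_sentence_pair_rows(examples):
    """Convert BoolQ-style records into sentence-pair rows.

    Simple Transformers expects the columns text_a, text_b and labels for
    sentence-pair classification. Mapping True -> 1 is an assumption.
    """
    return [
        {
            "text_a": ex["question"],
            "text_b": ex["passage"],
            "labels": int(ex["answer"]),
        }
        for ex in examples
    ]


def train_boolq_classifier(examples, model_args,
                           model_type="roberta",       # placeholder, not from the card
                           model_name="roberta-base",  # placeholder, not from the card
                           use_cuda=False):
    """Train a binary sentence-pair classifier with the given ClassificationArgs."""
    # Imported lazily so the data helper above works without these packages.
    import pandas as pd
    from simpletransformers.classification import ClassificationModel

    train_df = pd.DataFrame(to_sentence_pair_rows(examples))
    model = ClassificationModel(model_type, model_name,
                                args=model_args, use_cuda=use_cuda)
    model.train_model(train_df)
    return model


# Tiny illustrative record (made up, not taken from BoolQ).
sample = [{"question": "is the sky blue",
           "passage": "On a clear day the sky appears blue.",
           "answer": True}]
rows = to_sentence_pair_rows(sample)
print(rows[0]["labels"])
```

Calling `train_boolq_classifier(records, model_args)` with the `model_args` object defined above would then run the five training epochs; evaluation is left out because the card sets `evaluate_during_training = False`.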
Evaluation on BoolQ Test Set:

|              | Precision | Recall | F1-score |
|:------------:|:---------:|:------:|:--------:|
| 0            | 0.82      | 0.80   | 0.81     |
| 1            | 0.88      | 0.89   | 0.88     |
| accuracy     |           |        | 0.86     |
| macro avg    | 0.85      | 0.84   | 0.85     |
| weighted avg | 0.86      | 0.86   | 0.86     |

ROC AUC Score: 0.844
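
As a sanity check on the table, the per-class F1 scores can be recomputed from the reported precision and recall, since F1 is their harmonic mean. This is a small sketch that uses only the rounded values shown above, so the results may differ from the originals in the last digit. The weighted average is not recomputed because the card does not list per-class support.

```python
# Precision and recall per class, copied from the table (rounded to 2 decimals).
reported = {
    0: {"precision": 0.82, "recall": 0.80},
    1: {"precision": 0.88, "recall": 0.89},
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


per_class_f1 = {
    label: f1_score(m["precision"], m["recall"])
    for label, m in reported.items()
}

# Macro averages weight every class equally, regardless of class size.
macro_precision = sum(m["precision"] for m in reported.values()) / len(reported)
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

for label, f1 in per_class_f1.items():
    print(f"class {label}: F1 = {f1:.2f}")
print(f"macro precision = {macro_precision:.2f}, macro F1 = {macro_f1:.2f}")
```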