62570c130d9c027e2ca47babd21cadfa
This model is a fine-tuned version of openai-community/gpt2-medium on the qnli configuration of the nyu-mll/glue dataset. It achieves the following results on the evaluation set:
- Loss: 0.4633
- Data Size: 1.0
- Epoch Runtime: 429.8114
- Accuracy: 0.8899
- F1 Macro: 0.8899
- Rouge1: 0.8899
- Rouge2: 0.0
- Rougel: 0.8901
- Rougelsum: 0.8901
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
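The total batch sizes in the list above follow from the per-device batch sizes and the device count. A minimal sketch of that arithmetic, assuming no gradient accumulation (the card does not list a value for it, so `gradient_accumulation_steps = 1` is an assumption, not a reported setting):

```python
# Values copied from the hyperparameter list above.
per_device_train_batch_size = 8
per_device_eval_batch_size = 8
num_devices = 4

# Assumption: no gradient accumulation, since the card does not report one.
gradient_accumulation_steps = 1

# Effective batch size per optimizer step across all devices.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
# Evaluation batches are not accumulated, so only the device count applies.
total_eval_batch_size = per_device_eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)
```

Both values come out to 32, which agrees with the reported total_train_batch_size and total_eval_batch_size.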
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Accuracy | F1 Macro | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 11.8969 | 0 | 8.5703 | 0.5066 | 0.3431 | 0.5062 | 0.0 | 0.5064 | 0.5064 |
| No log | 1 | 3273 | 0.7073 | 0.0078 | 13.2616 | 0.5860 | 0.5237 | 0.5858 | 0.0 | 0.5855 | 0.5858 |
| 0.022 | 2 | 6546 | 0.5080 | 0.0156 | 15.7701 | 0.7616 | 0.7535 | 0.7616 | 0.0 | 0.7614 | 0.7612 |
| 0.5882 | 3 | 9819 | 0.4724 | 0.0312 | 22.9579 | 0.8057 | 0.8042 | 0.8053 | 0.0 | 0.8059 | 0.8057 |
| 0.4573 | 4 | 13092 | 0.4225 | 0.0625 | 35.7555 | 0.8156 | 0.8140 | 0.8154 | 0.0 | 0.8158 | 0.8154 |
| 0.3789 | 5 | 16365 | 0.3449 | 0.125 | 62.8776 | 0.8548 | 0.8546 | 0.8548 | 0.0 | 0.8550 | 0.8544 |
| 0.3552 | 6 | 19638 | 0.3032 | 0.25 | 116.2979 | 0.8796 | 0.8795 | 0.8794 | 0.0 | 0.8798 | 0.8796 |
| 0.2876 | 7 | 22911 | 0.3125 | 0.5 | 221.1460 | 0.8651 | 0.8644 | 0.8649 | 0.0 | 0.8653 | 0.8649 |
| 0.2608 | 8 | 26184 | 0.2901 | 1.0 | 432.1524 | 0.8820 | 0.8817 | 0.8822 | 0.0 | 0.8818 | 0.8822 |
| 0.1293 | 9 | 29457 | 0.3893 | 1.0 | 435.3946 | 0.8844 | 0.8842 | 0.8842 | 0.0 | 0.8847 | 0.8846 |
| 0.1096 | 10 | 32730 | 0.3718 | 1.0 | 434.1044 | 0.8919 | 0.8918 | 0.8919 | 0.0 | 0.8923 | 0.8919 |
| 0.0711 | 11 | 36003 | 0.4568 | 1.0 | 430.7896 | 0.8965 | 0.8965 | 0.8967 | 0.0 | 0.8967 | 0.8963 |
| 0.0902 | 12 | 39276 | 0.4633 | 1.0 | 429.8114 | 0.8899 | 0.8899 | 0.8899 | 0.0 | 0.8901 | 0.8901 |
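The headline metrics at the top of this card match the last row of the table (epoch 12), not the best row. The sketch below shows one way to read the table: it copies four columns from the rows above and picks the epoch with the lowest validation loss and the epoch with the highest accuracy. The tuple layout and variable names are illustrative choices, not part of the training code.

```python
# Rows copied from the training results table:
# (epoch, training_loss, validation_loss, accuracy).
# training_loss is None where the table says "No log".
rows = [
    (0, None, 11.8969, 0.5066),
    (1, None, 0.7073, 0.5860),
    (2, 0.022, 0.5080, 0.7616),
    (3, 0.5882, 0.4724, 0.8057),
    (4, 0.4573, 0.4225, 0.8156),
    (5, 0.3789, 0.3449, 0.8548),
    (6, 0.3552, 0.3032, 0.8796),
    (7, 0.2876, 0.3125, 0.8651),
    (8, 0.2608, 0.2901, 0.8820),
    (9, 0.1293, 0.3893, 0.8844),
    (10, 0.1096, 0.3718, 0.8919),
    (11, 0.0711, 0.4568, 0.8965),
    (12, 0.0902, 0.4633, 0.8899),
]

# Epoch with the lowest validation loss.
best_by_loss = min(rows, key=lambda r: r[2])
# Epoch with the highest evaluation accuracy.
best_by_accuracy = max(rows, key=lambda r: r[3])
# Last logged epoch, which is what the summary at the top reports.
final = rows[-1]

print("lowest validation loss:", best_by_loss[0], best_by_loss[2])
print("highest accuracy:", best_by_accuracy[0], best_by_accuracy[3])
print("final epoch:", final[0], final[2], final[3])
```

On these rows, validation loss is lowest at epoch 8 (0.2901) and accuracy peaks at epoch 11 (0.8965). After epoch 8 the training loss keeps falling while the validation loss rises, a pattern that is consistent with overfitting.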
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.3.0
- Tokenizers 0.22.1
Model tree for contemmcm/62570c130d9c027e2ca47babd21cadfa
Base model
openai-community/gpt2-medium