1bcc5bb289b49178c8102d291edd5634
This model is a fine-tuned version of FacebookAI/roberta-base on the stsb subset of the nyu-mll/glue dataset. It achieves the following results on the evaluation set, which correspond to the last logged epoch (13) in the training results table:
- Loss: 0.4738
- Data Size: 1.0
- Epoch Runtime: 15.3219
- Mse: 0.4740
- Mae: 0.5264
- R2: 0.7879
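The three regression metrics above can be reproduced from raw predictions with a few lines of code. The following is a minimal sketch, not the evaluation code used for this model: the function name `regression_metrics` and the toy scores are assumptions made for illustration, and only the standard definitions of MSE, MAE and R2 are taken as given.

```python
def regression_metrics(y_true, y_pred):
    """Return MSE, MAE and R2 for two equal-length sequences of floats."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    variance = sum((t - mean_true) ** 2 for t in y_true) / n
    # R2 = 1 - SS_res / SS_tot; undefined when the targets are constant.
    r2 = 1.0 - mse / variance if variance > 0 else float("nan")
    return {"mse": mse, "mae": mae, "r2": r2}


# Hypothetical gold and predicted similarity scores on the 0-5 STS-B scale.
gold = [0.0, 2.5, 5.0, 3.0]
pred = [0.5, 2.0, 4.5, 3.5]
metrics = regression_metrics(gold, pred)
```

Since R2 equals 1 minus MSE divided by the variance of the gold scores, a negative R2 (as in the earliest epochs of the training table) means the model did worse than always predicting the mean score.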
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
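The total batch sizes listed above follow from the per-device batch size and the number of devices. The sketch below records the hyperparameters in a plain dictionary and recomputes the totals. The key names mirror `transformers.TrainingArguments` fields, which is an assumed mapping from the card's labels, and a gradient accumulation of 1 is also an assumption, since the card does not list it.

```python
# Hyperparameters copied from the list above. The key names follow
# transformers.TrainingArguments fields (assumed mapping, not confirmed by the card).
hparams = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}
num_devices = 4  # from the card: distributed_type multi-GPU, num_devices 4


def effective_batch_size(per_device, devices, grad_accum_steps=1):
    """Examples consumed per optimizer step across all devices.

    grad_accum_steps defaults to 1, an assumption: the card does not report it.
    """
    return per_device * devices * grad_accum_steps


total_train = effective_batch_size(hparams["per_device_train_batch_size"], num_devices)
total_eval = effective_batch_size(hparams["per_device_eval_batch_size"], num_devices)
```

With these values both totals come out to 32, matching total_train_batch_size and total_eval_batch_size. If the mapping holds, the dictionary could be unpacked into `TrainingArguments` together with an output directory of your choice.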
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Mse | Mae | R2 |
|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 6.9295 | 0 | 1.6349 | 6.9307 | 2.2025 | -2.1004 |
| No log | 1 | 179 | 5.4552 | 0.0078 | 2.0674 | 5.4564 | 1.9334 | -1.4408 |
| No log | 2 | 358 | 2.1803 | 0.0156 | 2.0406 | 2.1812 | 1.2750 | 0.0243 |
| No log | 3 | 537 | 2.1721 | 0.0312 | 2.4043 | 2.1730 | 1.2713 | 0.0279 |
| No log | 4 | 716 | 2.3324 | 0.0625 | 2.9166 | 2.3332 | 1.2777 | -0.0437 |
| No log | 5 | 895 | 1.5504 | 0.125 | 3.6610 | 1.5505 | 1.0381 | 0.3064 |
| 0.1407 | 6 | 1074 | 0.7633 | 0.25 | 5.2091 | 0.7634 | 0.6882 | 0.6585 |
| 0.6503 | 7 | 1253 | 0.5737 | 0.5 | 8.4262 | 0.5740 | 0.6077 | 0.7432 |
| 0.5047 | 8 | 1432 | 0.4593 | 1.0 | 15.0540 | 0.4596 | 0.5177 | 0.7944 |
| 0.3605 | 9 | 1611 | 0.4260 | 1.0 | 14.5109 | 0.4261 | 0.4844 | 0.8094 |
| 0.2824 | 10 | 1790 | 0.4959 | 1.0 | 14.9747 | 0.4961 | 0.5516 | 0.7781 |
| 0.2249 | 11 | 1969 | 0.5420 | 1.0 | 14.7415 | 0.5420 | 0.5667 | 0.7575 |
| 0.1729 | 12 | 2148 | 0.4392 | 1.0 | 15.0047 | 0.4394 | 0.4925 | 0.8035 |
| 0.1521 | 13 | 2327 | 0.4738 | 1.0 | 15.3219 | 0.4740 | 0.5264 | 0.7879 |
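The table can be scanned programmatically to compare the final checkpoint with the best one. The sketch below copies the epoch, validation loss and R2 columns from the table and picks the row with the lowest validation loss; the variable names are illustrative. The log ends at epoch 13 although num_epochs is 50, and the card does not say why.

```python
# (epoch, validation_loss, r2) copied from the training results table.
rows = [
    (0, 6.9295, -2.1004),
    (1, 5.4552, -1.4408),
    (2, 2.1803, 0.0243),
    (3, 2.1721, 0.0279),
    (4, 2.3324, -0.0437),
    (5, 1.5504, 0.3064),
    (6, 0.7633, 0.6585),
    (7, 0.5737, 0.7432),
    (8, 0.4593, 0.7944),
    (9, 0.4260, 0.8094),
    (10, 0.4959, 0.7781),
    (11, 0.5420, 0.7575),
    (12, 0.4392, 0.8035),
    (13, 0.4738, 0.7879),
]

# Row with the lowest validation loss, and the last logged row.
best_epoch, best_loss, best_r2 = min(rows, key=lambda row: row[1])
final_epoch, final_loss, final_r2 = rows[-1]

print(f"best:  epoch {best_epoch}, loss {best_loss:.4f}, R2 {best_r2:.4f}")
print(f"final: epoch {final_epoch}, loss {final_loss:.4f}, R2 {final_r2:.4f}")
```

On these numbers the lowest validation loss is at epoch 9 (0.4260, R2 0.8094), while the headline results reported at the top of the card are those of epoch 13.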
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.3.0
- Tokenizers 0.22.1
Model tree for contemmcm/1bcc5bb289b49178c8102d291edd5634
Base model
FacebookAI/roberta-base