e51f9e2db13bbd7d0f7d3d5e84c3ab24

This model is a fine-tuned version of google-bert/bert-base-cased on the stsb subset of the nyu-mll/glue dataset (STS-B, a sentence-pair similarity regression task). It achieves the following results on the evaluation set; a usage sketch follows the list:

  • Loss: 0.6319
  • Data Size: 1.0
  • Epoch Runtime: 10.1508
  • MSE: 0.6322
  • MAE: 0.5956
  • R²: 0.7172
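
For reference, here is a minimal inference sketch in Python. The repo id is assumed from this card, and the checkpoint is assumed to carry STS-B's single-output regression head in its saved config:

```python
# Minimal inference sketch; the repo id is assumed from this card and the
# model is assumed to expose STS-B's single-logit regression head.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "contemmcm/e51f9e2db13bbd7d0f7d3d5e84c3ab24"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

# STS-B scores sentence pairs on a 0-5 similarity scale.
inputs = tokenizer(
    "A man is playing a guitar.",
    "Someone is playing an instrument.",
    return_tensors="pt",
)
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()
print(f"predicted similarity: {score:.2f}")
```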

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
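
As a sketch, these settings map onto transformers.TrainingArguments as shown below. The output directory is a placeholder, and data loading and Trainer setup are omitted; note that the per-device batch size of 8 across 4 GPUs yields the reported total batch size of 32:

```python
# Sketch only: maps the hyperparameters above onto TrainingArguments.
# output_dir is a placeholder; dataset and Trainer wiring are omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-base-cased-stsb",  # hypothetical output path
    learning_rate=5e-05,
    per_device_train_batch_size=8,      # 8 per device x 4 GPUs = 32 total
    per_device_eval_batch_size=8,
    seed=42,
    num_train_epochs=50,
    lr_scheduler_type="constant",
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
)
```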

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | MSE    | MAE    | R²      |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:------:|:------:|:-------:|
| No log        | 0     | 0    | 7.9583          | 0         | 1.2341        | 7.9596 | 2.3816 | -2.5606 |
| No log        | 1     | 179  | 5.4167          | 0.0078    | 1.5089        | 5.4179 | 1.9252 | -1.4236 |
| No log        | 2     | 358  | 3.2484          | 0.0156    | 1.5355        | 3.2494 | 1.5146 | -0.4536 |
| No log        | 3     | 537  | 2.3488          | 0.0312    | 1.9320        | 2.3496 | 1.2965 | -0.0511 |
| No log        | 4     | 716  | 1.5684          | 0.0625    | 2.2067        | 1.5690 | 1.0375 | 0.2981  |
| No log        | 5     | 895  | 1.2524          | 0.125     | 2.6960        | 1.2528 | 0.9265 | 0.4396  |
| 0.1255        | 6     | 1074 | 0.7943          | 0.25      | 3.8428        | 0.7947 | 0.7140 | 0.6445  |
| 0.7893        | 7     | 1253 | 0.8066          | 0.5       | 5.9542        | 0.8070 | 0.7059 | 0.6390  |
| 0.5936        | 8     | 1432 | 0.6476          | 1.0       | 10.4701       | 0.6479 | 0.6121 | 0.7102  |
| 0.4098        | 9     | 1611 | 0.6429          | 1.0       | 10.6566       | 0.6432 | 0.6071 | 0.7123  |
| 0.2866        | 10    | 1790 | 0.6838          | 1.0       | 10.7071       | 0.6840 | 0.6174 | 0.6940  |
| 0.2016        | 11    | 1969 | 0.6318          | 1.0       | 10.9308       | 0.6321 | 0.5992 | 0.7173  |
| 0.1634        | 12    | 2148 | 0.6374          | 1.0       | 10.1043       | 0.6378 | 0.6097 | 0.7147  |
| 0.1370        | 13    | 2327 | 0.6114          | 1.0       | 10.1973       | 0.6117 | 0.5897 | 0.7264  |
| 0.1204        | 14    | 2506 | 0.6128          | 1.0       | 10.1341       | 0.6130 | 0.5818 | 0.7258  |
| 0.1038        | 15    | 2685 | 0.6647          | 1.0       | 10.1180       | 0.6651 | 0.6040 | 0.7025  |
| 0.1055        | 16    | 2864 | 0.6246          | 1.0       | 10.2543       | 0.6249 | 0.6046 | 0.7205  |
| 0.0913        | 17    | 3043 | 0.6319          | 1.0       | 10.1508       | 0.6322 | 0.5956 | 0.7172  |
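
The MSE, MAE, and R² columns can be reproduced from raw predictions as in the following sketch; the example arrays are placeholders, not the actual evaluation outputs behind this table:

```python
# Sketch of the regression metrics reported above (MSE, MAE, R²).
# The example arrays are placeholders, not actual evaluation outputs.
import numpy as np

def regression_metrics(preds: np.ndarray, labels: np.ndarray) -> dict:
    mse = float(np.mean((preds - labels) ** 2))
    mae = float(np.mean(np.abs(preds - labels)))
    # R² = 1 - (residual sum of squares / total sum of squares)
    ss_res = float(np.sum((labels - preds) ** 2))
    ss_tot = float(np.sum((labels - labels.mean()) ** 2))
    return {"mse": mse, "mae": mae, "r2": 1.0 - ss_res / ss_tot}

print(regression_metrics(np.array([3.1, 0.4, 4.8]), np.array([3.0, 0.5, 5.0])))
```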

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.3.0
  • Tokenizers 0.22.1