bert-base-Luis-Suarez

This model is a fine-tuned version of google-bert/bert-base-uncased on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8114
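
No usage example is given in the card. Below is a minimal sketch of loading the checkpoint with the Transformers library; since the card does not state the task head, `AutoModel` (the bare encoder) is used here as an assumption:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Repo id as listed on the Hub; the task-specific head is unknown,
# so we load only the encoder.
repo = "YuvrajSingh9886/bert-base-Luis-Suarez"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModel.from_pretrained(repo)

inputs = tokenizer("Hello world", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# bert-base produces 768-dimensional hidden states
print(outputs.last_hidden_state.shape)
```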

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 2
  • mixed_precision_training: Native AMP
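
The hyperparameters above correspond roughly to the following `Trainer` configuration (a sketch, not the exact training script; `output_dir` is a placeholder, and dataset/model setup is omitted since the card does not specify them):

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the training setup described above.
args = TrainingArguments(
    output_dir="bert-base-Luis-Suarez",  # placeholder, not from the card
    learning_rate=2e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=2,
    fp16=True,  # "Native AMP" mixed precision
)
```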

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.3492        | 0.0433 | 10   | 1.3237          |
| 1.3458        | 0.0866 | 20   | 1.2776          |
| 1.2506        | 0.1299 | 30   | 1.1714          |
| 1.2344        | 0.1732 | 40   | 1.1390          |
| 1.1679        | 0.2165 | 50   | 1.2270          |
| 1.1133        | 0.2597 | 60   | 1.0504          |
| 1.0285        | 0.3030 | 70   | 1.0773          |
| 0.9797        | 0.3463 | 80   | 1.0742          |
| 0.9846        | 0.3896 | 90   | 0.9647          |
| 1.0115        | 0.4329 | 100  | 1.0241          |
| 1.0053        | 0.4762 | 110  | 1.0325          |
| 0.9989        | 0.5195 | 120  | 0.9911          |
| 0.9252        | 0.5628 | 130  | 1.0684          |
| 1.034         | 0.6061 | 140  | 0.9263          |
| 0.9777        | 0.6494 | 150  | 0.9602          |
| 0.9752        | 0.6926 | 160  | 0.9061          |
| 0.9104        | 0.7359 | 170  | 0.9063          |
| 0.8962        | 0.7792 | 180  | 0.9016          |
| 0.9687        | 0.8225 | 190  | 0.9663          |
| 1.0488        | 0.8658 | 200  | 0.9008          |
| 0.8899        | 0.9091 | 210  | 0.8644          |
| 0.9072        | 0.9524 | 220  | 0.8653          |
| 0.8419        | 0.9957 | 230  | 0.8947          |
| 0.7472        | 1.0390 | 240  | 0.8731          |
| 0.6227        | 1.0823 | 250  | 0.9624          |
| 0.6734        | 1.1255 | 260  | 1.0027          |
| 0.7816        | 1.1688 | 270  | 0.8837          |
| 0.6177        | 1.2121 | 280  | 0.9030          |
| 0.5895        | 1.2554 | 290  | 0.9194          |
| 0.7603        | 1.2987 | 300  | 0.8774          |
| 0.7083        | 1.3420 | 310  | 0.8270          |
| 0.5926        | 1.3853 | 320  | 0.8414          |
| 0.672         | 1.4286 | 330  | 0.8468          |
| 0.6621        | 1.4719 | 340  | 0.8415          |
| 0.6322        | 1.5152 | 350  | 0.8204          |
| 0.606         | 1.5584 | 360  | 0.8284          |
| 0.5466        | 1.6017 | 370  | 0.8284          |
| 0.6634        | 1.6450 | 380  | 0.8335          |
| 0.592         | 1.6883 | 390  | 0.8258          |
| 0.5715        | 1.7316 | 400  | 0.8128          |
| 0.54          | 1.7749 | 410  | 0.8112          |
| 0.4777        | 1.8182 | 420  | 0.8126          |
| 0.6524        | 1.8615 | 430  | 0.8130          |
| 0.5118        | 1.9048 | 440  | 0.8122          |
| 0.5037        | 1.9481 | 450  | 0.8113          |
| 0.5577        | 1.9913 | 460  | 0.8114          |
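
The table logs every 10 steps; step 460 falls at epoch ≈1.99, implying roughly 231 steps per epoch and ≈462 total optimizer steps, so `warmup_ratio: 0.1` corresponds to about 46 warmup steps. A plain-Python sketch of the shape of such a linear-warmup-then-cosine schedule (an illustration, not the Trainer's exact implementation):

```python
import math

BASE_LR = 2e-4       # learning_rate from the hyperparameters above
TOTAL_STEPS = 462    # ~231 steps/epoch x 2 epochs, inferred from the table
WARMUP = int(0.1 * TOTAL_STEPS)  # 46 steps, from warmup_ratio=0.1


def lr_at(step: int) -> float:
    """Linear warmup to BASE_LR, then cosine decay to zero."""
    if step < WARMUP:
        return BASE_LR * step / WARMUP
    progress = (step - WARMUP) / (TOTAL_STEPS - WARMUP)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The learning rate rises linearly over the first 46 steps, peaks at 2e-4, and decays along a cosine curve to zero by the end of training.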

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.3.0+cu121
  • Datasets 3.3.2
  • Tokenizers 0.21.0