bert-base-cased

This model was trained from scratch on an unspecified dataset (the card lists the dataset as None). It achieves the following results on the evaluation set:

Loss: 0.1496
Precision: 0.8118
Recall: 0.8887
F1: 0.8485
Accuracy: 0.9738
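
As a quick consistency check on the figures above, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded values reported here; the function name `f1_score` is illustrative and not taken from the training code, and because the inputs are already rounded to four decimals, the last digit could differ for other rows.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation results listed in this card.
precision, recall = 0.8118, 0.8887
f1 = f1_score(precision, recall)
print(round(f1, 4))
```

The result agrees with the reported F1 of 0.8485 to four decimal places.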

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.5
num_epochs: 44
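
To make the scheduler settings concrete, here is a minimal pure-Python sketch of a linear schedule with warmup. It assumes 880 total optimizer steps (the final step in the training results table) and is only an illustration of the schedule's shape, not the Trainer's own implementation; the function name `linear_lr_with_warmup` is hypothetical. With a warmup ratio of 0.5, the learning rate would rise linearly for the first 440 steps and decay linearly to zero over the remaining 440.

```python
import math


def linear_lr_with_warmup(step: int, peak_lr: float,
                          total_steps: int, warmup_ratio: float) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


PEAK_LR = 3e-05      # learning_rate from the list above
TOTAL_STEPS = 880    # assumption: last step reported in the results table
WARMUP_RATIO = 0.5   # lr_scheduler_warmup_ratio from the list above

for step in (0, 220, 440, 660, 880):
    lr = linear_lr_with_warmup(step, PEAK_LR, TOTAL_STEPS, WARMUP_RATIO)
    print(f"step {step:>3}: lr = {lr:.2e}")
```

Under these assumptions the peak learning rate of 3e-05 is reached only at the midpoint of training, which is an unusually long warmup compared with the more common ratios of 0.1 or less.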

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
No log	1.0	20	2.2345	0.0029	0.0183	0.0049	0.4324
No log	2.0	40	1.5703	0.0	0.0	0.0	0.7638
No log	3.0	60	0.9037	0.0	0.0	0.0	0.7730
No log	4.0	80	0.6507	0.4278	0.2558	0.3202	0.8433
No log	5.0	100	0.4402	0.4303	0.4618	0.4455	0.8856
No log	6.0	120	0.3110	0.6084	0.6993	0.6507	0.9278
No log	7.0	140	0.2382	0.6779	0.7691	0.7206	0.9428
No log	8.0	160	0.1981	0.7346	0.7907	0.7616	0.9512
No log	9.0	180	0.1748	0.7387	0.8123	0.7737	0.9559
No log	10.0	200	0.1496	0.7432	0.8223	0.7808	0.9617
No log	11.0	220	0.1358	0.7620	0.8455	0.8016	0.9650
No log	12.0	240	0.1351	0.7637	0.8538	0.8063	0.9678
No log	13.0	260	0.1365	0.7887	0.8555	0.8207	0.9692
No log	14.0	280	0.1323	0.7460	0.8588	0.7985	0.9662
No log	15.0	300	0.1362	0.7518	0.8654	0.8046	0.9663
No log	16.0	320	0.1277	0.8	0.8704	0.8337	0.9707
No log	17.0	340	0.1319	0.7699	0.8671	0.8156	0.9698
No log	18.0	360	0.1323	0.7697	0.8605	0.8125	0.9692
No log	19.0	380	0.1383	0.7988	0.8704	0.8331	0.9708
No log	20.0	400	0.1278	0.7696	0.8654	0.8147	0.9702
No log	21.0	420	0.1437	0.7833	0.8704	0.8245	0.9687
No log	22.0	440	0.1316	0.8166	0.8804	0.8473	0.9729
No log	23.0	460	0.1369	0.7409	0.8787	0.8040	0.9668
No log	24.0	480	0.1415	0.7390	0.8654	0.7972	0.9667
0.3547	25.0	500	0.1354	0.7982	0.8870	0.8403	0.9723
0.3547	26.0	520	0.1352	0.7715	0.8804	0.8223	0.9704
0.3547	27.0	540	0.1424	0.8116	0.8804	0.8446	0.9710
0.3547	28.0	560	0.1376	0.8297	0.8821	0.8551	0.9731
0.3547	29.0	580	0.1397	0.7736	0.8854	0.8257	0.9703
0.3547	30.0	600	0.1379	0.7852	0.8804	0.8301	0.9723
0.3547	31.0	620	0.1426	0.8012	0.8837	0.8404	0.9733
0.3547	32.0	640	0.1441	0.7973	0.8821	0.8375	0.9726
0.3547	33.0	660	0.1470	0.7568	0.8837	0.8153	0.9677
0.3547	34.0	680	0.1410	0.7806	0.8804	0.8275	0.9715
0.3547	35.0	700	0.1474	0.8213	0.8854	0.8521	0.9739
0.3547	36.0	720	0.1469	0.8070	0.8821	0.8429	0.9726
0.3547	37.0	740	0.1494	0.8225	0.8854	0.8528	0.9735
0.3547	38.0	760	0.1413	0.7830	0.8870	0.8318	0.9724
0.3547	39.0	780	0.1448	0.8165	0.8870	0.8503	0.9739
0.3547	40.0	800	0.1515	0.8203	0.8870	0.8524	0.9735
0.3547	41.0	820	0.1506	0.8066	0.8870	0.8449	0.9733
0.3547	42.0	840	0.1518	0.8103	0.8870	0.8469	0.9739
0.3547	43.0	860	0.1494	0.8106	0.8887	0.8479	0.9738
0.3547	44.0	880	0.1496	0.8118	0.8887	0.8485	0.9738
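
In the table above, validation loss is lowest at epoch 16 (0.1277) and F1 is highest at epoch 28 (0.8551), while the headline results come from the final epoch, 44. The sketch below shows one way to pick a checkpoint by either criterion. It uses a handful of rows copied from the table; the `EvalRow` type is illustrative, and whether intermediate checkpoints were actually saved is not stated in this card.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: int
    step: int
    val_loss: float
    precision: float
    recall: float
    f1: float
    accuracy: float


# A subset of rows copied from the training results table.
rows = [
    EvalRow(16, 320, 0.1277, 0.8000, 0.8704, 0.8337, 0.9707),
    EvalRow(22, 440, 0.1316, 0.8166, 0.8804, 0.8473, 0.9729),
    EvalRow(28, 560, 0.1376, 0.8297, 0.8821, 0.8551, 0.9731),
    EvalRow(35, 700, 0.1474, 0.8213, 0.8854, 0.8521, 0.9739),
    EvalRow(44, 880, 0.1496, 0.8118, 0.8887, 0.8485, 0.9738),
]

best_by_loss = min(rows, key=lambda r: r.val_loss)
best_by_f1 = max(rows, key=lambda r: r.f1)

print(f"lowest validation loss: epoch {best_by_loss.epoch} ({best_by_loss.val_loss})")
print(f"highest F1:             epoch {best_by_f1.epoch} ({best_by_f1.f1})")
```

The two criteria disagree here, so the choice depends on whether calibration (loss) or task performance (F1) matters more for the intended use.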

Framework versions

Transformers 4.57.0
PyTorch 2.8.0+cu128
Datasets 4.1.1
Tokenizers 0.22.1

Safetensors

Model size: 0.1B params
Tensor type: F32
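
The model size and tensor type listed above give a rough estimate of the weight footprint. The sketch below multiplies the parameter count by the bytes per element; note that 0.1B is a rounded figure, so the result is only approximate, and it covers weights alone (no activations, optimizer state, or framework overhead). The helper `weight_bytes` is illustrative.

```python
# Bytes per element for common tensor types.
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2}


def weight_bytes(num_params: float, tensor_type: str) -> float:
    """Approximate storage needed for the weights alone."""
    return num_params * BYTES_PER_PARAM[tensor_type]


approx_params = 0.1e9  # rounded count reported in this card
size_gb = weight_bytes(approx_params, "F32") / 1e9
print(f"approximate weight size: {size_gb:.1f} GB")
```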