appearance

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set (a sketch of the metric computation follows the list):

  • Loss: 1.0131
  • Accuracy: 0.6805
  • F1 Macro: 0.6261
  • Precision Macro: 0.6468
  • Recall Macro: 0.6250
  • Total Tf: [279, 131, 1099, 131]
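
The card does not say how these metrics were computed. As a point of reference, the sketch below shows a `compute_metrics` callback, in the style commonly used with the Hugging Face `Trainer`, that would reproduce the accuracy and macro-averaged scores under standard scikit-learn definitions. The derivation of `Total Tf` is undocumented, so it is omitted; all names here are illustrative assumptions, not details taken from this model's training script.

```python
# A minimal sketch, assuming the metrics follow standard scikit-learn
# definitions; this is NOT taken from the model's actual training code.
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    """Turn (logits, labels) from Trainer evaluation into the card's metrics."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1_macro": f1,
        "precision_macro": precision,
        "recall_macro": recall,
        # "Total Tf" is reported in the card, but its definition is not
        # documented, so it is intentionally left out here.
    }
```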

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a matching Trainer setup is sketched after the list):

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 38
  • num_epochs: 25
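
These values map one-to-one onto `transformers.TrainingArguments`. Below is a minimal sketch of a matching `Trainer` setup; since the card names neither the base checkpoint, the label count, nor the datasets, those parts are explicit placeholders. The listed Adam betas and epsilon are the Trainer's optimizer defaults and therefore need no explicit setting.

```python
# A minimal sketch, not the author's actual script. The base checkpoint,
# label count, and datasets are unknown and marked as placeholders.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base_checkpoint = "bert-base-uncased"  # placeholder: base model not stated in the card
num_labels = 3                         # placeholder: class count not stated

tokenizer = AutoTokenizer.from_pretrained(base_checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    base_checkpoint, num_labels=num_labels
)

# Tiny dummy data so the sketch runs end-to-end; the real dataset is unknown.
dummy = Dataset.from_dict(
    {"text": ["example one", "example two"], "label": [0, 1]}
)
tokenized = dummy.map(
    lambda batch: tokenizer(batch["text"], truncation=True), batched=True
)
train_dataset = eval_dataset = tokenized  # placeholders: real splits unknown

args = TrainingArguments(
    output_dir="appearance",
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=38,
    num_train_epochs=25,
    eval_strategy="epoch",  # the results table logs metrics once per epoch
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the defaults,
    # so they are not set explicitly here.
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,  # e.g. the sketch shown earlier
)
trainer.train()
```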

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 Macro | Precision Macro | Recall Macro | Total Tf |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:---------------:|:------------:|:--------:|
| 1.0882 | 1.0 | 39 | 1.0748 | 0.6122 | 0.4014 | 0.3585 | 0.5 | [251, 159, 1071, 159] |
| 1.0566 | 2.0 | 78 | 1.0612 | 0.6122 | 0.4014 | 0.3585 | 0.5 | [251, 159, 1071, 159] |
| 1.0157 | 3.0 | 117 | 1.0543 | 0.6171 | 0.4700 | 0.4692 | 0.5254 | [253, 157, 1073, 157] |
| 0.9566 | 4.0 | 156 | 1.0220 | 0.6585 | 0.5851 | 0.6129 | 0.5931 | [270, 140, 1090, 140] |
| 0.8942 | 5.0 | 195 | 1.0177 | 0.6707 | 0.6268 | 0.6288 | 0.6265 | [275, 135, 1095, 135] |
| 0.8334 | 6.0 | 234 | 1.0868 | 0.5902 | 0.5460 | 0.5782 | 0.5687 | [242, 168, 1062, 168] |
| 0.7717 | 7.0 | 273 | 1.0260 | 0.6585 | 0.5920 | 0.6165 | 0.5945 | [270, 140, 1090, 140] |
| 0.8031 | 8.0 | 312 | 1.0290 | 0.6585 | 0.5821 | 0.6298 | 0.5865 | [270, 140, 1090, 140] |
| 0.7367 | 9.0 | 351 | 1.0135 | 0.6732 | 0.6175 | 0.6326 | 0.6166 | [276, 134, 1096, 134] |
| 0.7453 | 10.0 | 390 | 1.0400 | 0.6439 | 0.5868 | 0.6096 | 0.5929 | [264, 146, 1084, 146] |
| 0.7362 | 11.0 | 429 | 1.0152 | 0.6707 | 0.5985 | 0.6256 | 0.6053 | [275, 135, 1095, 135] |
| 0.6926 | 12.0 | 468 | 1.0143 | 0.6805 | 0.6156 | 0.6429 | 0.6179 | [279, 131, 1099, 131] |
| 0.6821 | 13.0 | 507 | 1.0325 | 0.6561 | 0.6133 | 0.6199 | 0.6160 | [269, 141, 1089, 141] |
| 0.6613 | 14.0 | 546 | 1.0184 | 0.6683 | 0.5984 | 0.6287 | 0.6036 | [274, 136, 1094, 136] |
| 0.6479 | 15.0 | 585 | 1.0198 | 0.6659 | 0.6176 | 0.6272 | 0.6158 | [273, 137, 1093, 137] |
| 0.6612 | 16.0 | 624 | 1.0137 | 0.6780 | 0.6191 | 0.6387 | 0.6194 | [278, 132, 1098, 132] |
| 0.6382 | 17.0 | 663 | 1.0194 | 0.6732 | 0.6107 | 0.6364 | 0.6126 | [276, 134, 1096, 134] |
| 0.6392 | 18.0 | 702 | 1.0085 | 0.6805 | 0.6288 | 0.6438 | 0.6272 | [279, 131, 1099, 131] |
| 0.6439 | 19.0 | 741 | 1.0100 | 0.6805 | 0.6266 | 0.6446 | 0.6259 | [279, 131, 1099, 131] |
| 0.6198 | 20.0 | 780 | 1.0145 | 0.6780 | 0.6305 | 0.6426 | 0.6309 | [278, 132, 1098, 132] |
| 0.6223 | 21.0 | 819 | 1.0200 | 0.6634 | 0.6079 | 0.6229 | 0.6089 | [272, 138, 1092, 138] |
| 0.6238 | 22.0 | 858 | 1.0049 | 0.6829 | 0.6389 | 0.6479 | 0.6372 | [280, 130, 1100, 130] |
| 0.6317 | 23.0 | 897 | 1.0042 | 0.6878 | 0.6410 | 0.6539 | 0.6378 | [282, 128, 1102, 128] |
| 0.6089 | 24.0 | 936 | 1.0130 | 0.6829 | 0.6308 | 0.6503 | 0.6292 | [280, 130, 1100, 130] |
| 0.6203 | 25.0 | 975 | 1.0131 | 0.6805 | 0.6261 | 0.6468 | 0.6250 | [279, 131, 1099, 131] |

Framework versions

  • Transformers 4.44.2
  • Pytorch 2.4.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.19.1

Model size

  • 0.1B params (Safetensors, F32)