albert-base-v2-wage-prediction

This model is a fine-tuned version of albert/albert-base-v2 on an unspecified dataset. Per-epoch results on the evaluation set are listed in the table below the training hyperparameters.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: OptimizerNames.ADAMW_BNB with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
num_epochs: 32
mixed_precision_training: Native AMP
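
To make the schedule concrete, here is a minimal sketch of how a cosine learning-rate decay would play out over this run. It uses only the values listed above plus the step counts from the results table (141 steps per epoch, 4512 steps in total). The function names `cosine_lr` and `train_set_size_bounds` are hypothetical, and the absence of warmup, the single device, and the lack of gradient accumulation are assumptions, since the card does not state them.

```python
import math

# Values taken from the hyperparameter list and the results table.
BASE_LR = 2e-05
NUM_EPOCHS = 32
STEPS_PER_EPOCH = 141  # step count logged at epoch 1.0
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 4512, the last logged step
TRAIN_BATCH_SIZE = 8


def cosine_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step` for a half-cosine decay with optional linear warmup.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


def train_set_size_bounds(steps_per_epoch=STEPS_PER_EPOCH, batch_size=TRAIN_BATCH_SIZE):
    """Range of training-set sizes consistent with the logged step count.

    Assumes one device, no gradient accumulation, and a kept partial last batch.
    """
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


if __name__ == "__main__":
    for epoch in (0, 8, 16, 24, 32):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:4d}  lr {cosine_lr(step):.3e}")
    print("training examples between", train_set_size_bounds())
```

Under these assumptions the learning rate starts at 2e-05, falls to half of that at the midpoint of training, and reaches zero at step 4512, which may explain why the validation loss barely moves over the last few epochs.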

Training Loss	Epoch	Step	Validation Loss	Mse
No log	1.0	141	0.456951	-0.456951
No log	2.0	282	0.413671	-0.413671
No log	3.0	423	0.383021	-0.383021
3.5515	4.0	564	0.338830	-0.338830
3.5515	5.0	705	0.346138	-0.346138
3.5515	6.0	846	0.310143	-0.310143
3.5515	7.0	987	0.459460	-0.459460
0.5439	8.0	1128	0.428964	-0.428964
0.5439	9.0	1269	0.372839	-0.372839
0.5439	10.0	1410	0.302395	-0.302395
0.4319	11.0	1551	0.269923	-0.269923
0.4319	12.0	1692	0.271052	-0.271052
0.4319	13.0	1833	0.303905	-0.303905
0.4319	14.0	1974	0.306196	-0.306196
0.3160	15.0	2115	0.305747	-0.305747
0.3160	16.0	2256	0.327453	-0.327453
0.3160	17.0	2397	0.318636	-0.318636
0.1363	18.0	2538	0.325792	-0.325792
0.1363	19.0	2679	0.336608	-0.336608
0.1363	20.0	2820	0.306705	-0.306705
0.1363	21.0	2961	0.303661	-0.303661
0.0475	22.0	3102	0.308157	-0.308157
0.0475	23.0	3243	0.304932	-0.304932
0.0475	24.0	3384	0.306834	-0.306834
0.0302	25.0	3525	0.300954	-0.300954
0.0302	26.0	3666	0.301874	-0.301874
0.0302	27.0	3807	0.300533	-0.300533
0.0302	28.0	3948	0.299371	-0.299371
0.0237	29.0	4089	0.299291	-0.299291
0.0237	30.0	4230	0.299130	-0.299130
0.0237	31.0	4371	0.299318	-0.299318
0.0221	32.0	4512	0.299504	-0.299504
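
In every row the Mse column equals the validation loss with the sign flipped. A mean squared error cannot be negative, so the metric was most likely reported negated so that higher is better; that reading is an inference from the table, not something the card states. The sketch below copies the validation losses from the table and picks out the best epoch both ways. The helper names `best_epoch` and `best_epoch_by_metric` are hypothetical.

```python
# Validation loss per epoch, copied from the results table (epochs 1 to 32).
VAL_LOSS = [
    0.456951, 0.413671, 0.383021, 0.338830, 0.346138, 0.310143, 0.459460, 0.428964,
    0.372839, 0.302395, 0.269923, 0.271052, 0.303905, 0.306196, 0.305747, 0.327453,
    0.318636, 0.325792, 0.336608, 0.306705, 0.303661, 0.308157, 0.304932, 0.306834,
    0.300954, 0.301874, 0.300533, 0.299371, 0.299291, 0.299130, 0.299318, 0.299504,
]

# The table's "Mse" column is the negated validation loss in every row.
REPORTED_MSE = [-loss for loss in VAL_LOSS]


def best_epoch(losses):
    """Return (epoch, loss) for the lowest validation loss; epochs are 1-based."""
    index = min(range(len(losses)), key=losses.__getitem__)
    return index + 1, losses[index]


def best_epoch_by_metric(metric_values):
    """Same selection on the negated metric, where a greater value is better."""
    index = max(range(len(metric_values)), key=metric_values.__getitem__)
    return index + 1, metric_values[index]


if __name__ == "__main__":
    epoch, loss = best_epoch(VAL_LOSS)
    print(f"lowest validation loss: {loss:.6f} at epoch {epoch}")
    print(f"final validation loss:  {VAL_LOSS[-1]:.6f} at epoch {len(VAL_LOSS)}")
    print("same epoch by negated metric:", best_epoch_by_metric(REPORTED_MSE)[0] == epoch)
```

Both selections land on epoch 11 (validation loss 0.269923), while the final epoch ends at 0.299504. Whether the published weights come from the best or the last checkpoint is not stated in the card.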

Model size: 11.7M params (Safetensors, tensor type F32)
