baseline_0.2

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 1.4983
Exact Match: 0.451

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 400
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-08
lr_scheduler_type: inverse_sqrt
lr_scheduler_warmup_steps: 4000
training_steps: 20000
label_smoothing_factor: 0.1
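
As a rough illustration of how the learning rate evolves under these settings, the sketch below reimplements an inverse square-root schedule with linear warmup in plain Python. It assumes the decay timescale equals the warmup length (4000 steps); the training code may have used a different timescale. `lr_at_step` is a hypothetical helper name, not a library function.

```python
import math

BASE_LR = 0.001       # learning_rate from the list above
WARMUP_STEPS = 4000   # lr_scheduler_warmup_steps from the list above


def lr_at_step(step, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS):
    """Hypothetical helper: linear warmup, then inverse square-root decay.

    Assumption: the decay timescale equals warmup_steps.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup period.
        return base_lr * step / max(1, warmup_steps)
    # After warmup the rate decays proportionally to 1 / sqrt(step).
    return base_lr * math.sqrt(warmup_steps / step)


if __name__ == "__main__":
    for step in (400, 4000, 8000, 20000):
        print(step, f"{lr_at_step(step):.6f}")
```

Under this assumption the rate peaks at 0.001 at step 4000 and has decayed to roughly 0.00045 by step 20000.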

Training results

Training Loss	Epoch	Step	Validation Loss	Exact Match
1.0167	30.7692	400	1.6942	0.081
1.01	61.5385	800	1.6964	0.071
1.0007	92.3077	1200	1.6578	0.109
0.9878	123.0769	1600	1.6108	0.157
0.9735	153.8462	2000	1.5436	0.201
0.9548	184.6154	2400	1.5288	0.26
0.9366	215.3846	2800	1.4851	0.275
0.9174	246.1538	3200	1.5143	0.28
0.8983	276.9231	3600	1.4985	0.274
0.8818	307.6923	4000	1.4550	0.323
0.8614	338.4615	4400	1.4834	0.332
0.8408	369.2308	4800	1.4253	0.394
0.8247	400.0	5200	1.4800	0.371
0.8119	430.7692	5600	1.4821	0.394
0.8006	461.5385	6000	1.4741	0.426
0.7919	492.3077	6400	1.4651	0.434
0.7856	523.0769	6800	1.5023	0.407
0.7797	553.8462	7200	1.4724	0.435
0.7748	584.6154	7600	1.5038	0.442
0.7707	615.3846	8000	1.5089	0.424
0.7675	646.1538	8400	1.5079	0.447
0.7645	676.9231	8800	1.5561	0.415
0.7612	707.6923	9200	1.5001	0.448
0.7592	738.4615	9600	1.5018	0.42
0.757	769.2308	10000	1.4909	0.45
0.7554	800.0	10400	1.5328	0.442
0.7532	830.7692	10800	1.4890	0.435
0.752	861.5385	11200	1.5386	0.425
0.7501	892.3077	11600	1.4787	0.442
0.7491	923.0769	12000	1.5313	0.43
0.7482	953.8462	12400	1.5069	0.431
0.7467	984.6154	12800	1.4891	0.457
0.7459	1015.3846	13200	1.4972	0.433
0.7449	1046.1538	13600	1.5395	0.42
0.7442	1076.9231	14000	1.5231	0.444
0.7435	1107.6923	14400	1.5112	0.425
0.7426	1138.4615	14800	1.5193	0.434
0.742	1169.2308	15200	1.5144	0.448
0.7411	1200.0	15600	1.5226	0.421
0.7407	1230.7692	16000	1.5013	0.461
0.7398	1261.5385	16400	1.5162	0.442
0.7394	1292.3077	16800	1.5417	0.418
0.7391	1323.0769	17200	1.5341	0.44
0.7386	1353.8462	17600	1.5455	0.432
0.7382	1384.6154	18000	1.5646	0.436
0.7374	1415.3846	18400	1.5468	0.43
0.7372	1446.1538	18800	1.5248	0.446
0.7367	1476.9231	19200	1.5088	0.461
0.7363	1507.6923	19600	1.5517	0.422
0.7359	1538.4615	20000	1.5061	0.444
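
The table can also be summarized programmatically. The sketch below is an illustration, not part of the original training code: it parses a handful of rows copied from the table above, finds the rows with the lowest validation loss and the highest exact match, and derives the number of optimizer steps per epoch from the Epoch and Step columns. The names `parse_rows` and `best_by` are hypothetical helpers.

```python
# A few rows copied from the table above (tab-separated columns):
# Training Loss, Epoch, Step, Validation Loss, Exact Match
ROWS = """\
1.0167\t30.7692\t400\t1.6942\t0.081
0.8408\t369.2308\t4800\t1.4253\t0.394
0.7407\t1230.7692\t16000\t1.5013\t0.461
0.7359\t1538.4615\t20000\t1.5061\t0.444
"""

FIELDS = ("train_loss", "epoch", "step", "val_loss", "exact_match")


def parse_rows(text):
    """Turn tab-separated table rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        values = [float(v) for v in line.split("\t")]
        row = dict(zip(FIELDS, values))
        row["step"] = int(row["step"])
        rows.append(row)
    return rows


def best_by(rows, key, lower_is_better=False):
    """Return the row with the best value for the given column."""
    pick = min if lower_is_better else max
    return pick(rows, key=lambda r: r[key])


rows = parse_rows(ROWS)

# Steps per epoch follows from any row: step / epoch.
steps_per_epoch = round(rows[0]["step"] / rows[0]["epoch"])

if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch)
    print("lowest validation loss at step",
          best_by(rows, "val_loss", lower_is_better=True)["step"])
    print("highest exact match at step",
          best_by(rows, "exact_match")["step"])
```

Applied to the full table, the lowest validation loss (1.4253) occurs at step 4800 and the highest exact match (0.461) at steps 16000 and 19200, so the final checkpoint at step 20000 is not the best one by either metric. The 13 steps per epoch, combined with a batch size of 400, suggest a training set of at most 5200 examples, assuming a single device and no gradient accumulation.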

Framework versions

Transformers 4.44.0
Pytorch 2.3.1+cu121
Datasets 2.20.0
Tokenizers 0.19.1

Safetensors

Model size: 7.36M params
Tensor type: F32
