ac011b5c3d74515833978506edd5a189

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 1.7477
  • Data Size: 1.0 (full training set)
  • Epoch Runtime: 20.9875 s
  • BLEU: 7.2494
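
For quick experimentation, the snippet below shows one way to load the checkpoint for translation with the Transformers AutoModel API. It is a minimal sketch: the repo id is taken from this page, while the task prefix and language pair are assumptions, since the card does not state which opus_books language pair was used.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

repo_id = "contemmcm/ac011b5c3d74515833978506edd5a189"  # repo id from this page

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

# T5 checkpoints are conventionally prompted with a task prefix; the
# "English to French" pair here is an assumption, not stated in the card.
text = "translate English to French: The book is on the table."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```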

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a code sketch reproducing them follows the list:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
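
As a rough guide, these settings map onto Seq2SeqTrainingArguments as sketched below. This is an assumption-laden reconstruction: the output_dir is hypothetical, and the growing data-size schedule visible in the results table comes from the (unpublished) training script rather than from any standard argument.

```python
from transformers import Seq2SeqTrainingArguments

# Hedged reconstruction of the listed hyperparameters. Per-device batch
# size 8 on 4 GPUs yields the total train/eval batch size of 32; the
# "Data Size" curriculum in the results table is not expressible here.
args = Seq2SeqTrainingArguments(
    output_dir="t5-base-opus-books",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
)
```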

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | BLEU |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-----------------:|:------:|
| No log | 0 | 0 | 3.6919 | 0 | 2.3837 | 0.2817 |
| No log | 1 | 77 | 3.6550 | 0.0078 | 2.6989 | 0.2857 |
| No log | 2 | 154 | 3.5567 | 0.0156 | 2.8912 | 0.2499 |
| No log | 3 | 231 | 3.4879 | 0.0312 | 3.3021 | 0.3033 |
| No log | 4 | 308 | 3.3999 | 0.0625 | 3.7415 | 0.4140 |
| No log | 5 | 385 | 3.2877 | 0.125 | 4.9392 | 0.7081 |
| 0.3282 | 6 | 462 | 3.1352 | 0.25 | 7.8107 | 0.8697 |
| 1.4232 | 7 | 539 | 2.9601 | 0.5 | 12.8925 | 0.9892 |
| 3.0986 | 8 | 616 | 2.7373 | 1.0 | 20.8772 | 1.3238 |
| 2.9416 | 9 | 693 | 2.5822 | 1.0 | 19.9890 | 2.1396 |
| 2.7286 | 10 | 770 | 2.4723 | 1.0 | 20.8685 | 2.7222 |
| 2.6411 | 11 | 847 | 2.3790 | 1.0 | 20.8805 | 2.8557 |
| 2.4996 | 12 | 924 | 2.3098 | 1.0 | 19.8070 | 3.0578 |
| 2.3682 | 13 | 1001 | 2.2397 | 1.0 | 19.8991 | 3.3912 |
| 2.314 | 14 | 1078 | 2.1845 | 1.0 | 21.0522 | 3.6963 |
| 2.2245 | 15 | 1155 | 2.1368 | 1.0 | 20.9468 | 4.1427 |
| 2.1665 | 16 | 1232 | 2.1020 | 1.0 | 20.3719 | 4.2497 |
| 2.0934 | 17 | 1309 | 2.0574 | 1.0 | 20.3863 | 4.6337 |
| 2.0375 | 18 | 1386 | 2.0166 | 1.0 | 21.5713 | 4.8467 |
| 1.959 | 19 | 1463 | 1.9812 | 1.0 | 21.4446 | 4.8427 |
| 1.9418 | 20 | 1540 | 1.9623 | 1.0 | 20.3362 | 5.0117 |
| 1.8663 | 21 | 1617 | 1.9344 | 1.0 | 20.9080 | 5.1227 |
| 1.8401 | 22 | 1694 | 1.9157 | 1.0 | 20.7733 | 5.2214 |
| 1.7756 | 23 | 1771 | 1.8923 | 1.0 | 21.6859 | 5.3765 |
| 1.7401 | 24 | 1848 | 1.8785 | 1.0 | 19.6916 | 5.5684 |
| 1.7017 | 25 | 1925 | 1.8578 | 1.0 | 21.2807 | 5.6905 |
| 1.6576 | 26 | 2002 | 1.8358 | 1.0 | 20.9607 | 5.7777 |
| 1.6186 | 27 | 2079 | 1.8285 | 1.0 | 21.4419 | 5.8534 |
| 1.5687 | 28 | 2156 | 1.8143 | 1.0 | 21.7423 | 5.8521 |
| 1.551 | 29 | 2233 | 1.8060 | 1.0 | 21.1729 | 6.0732 |
| 1.5108 | 30 | 2310 | 1.8009 | 1.0 | 21.8225 | 6.0521 |
| 1.496 | 31 | 2387 | 1.7827 | 1.0 | 21.1280 | 6.2218 |
| 1.4387 | 32 | 2464 | 1.7746 | 1.0 | 20.3537 | 6.2464 |
| 1.4301 | 33 | 2541 | 1.7809 | 1.0 | 20.7134 | 6.3561 |
| 1.3959 | 34 | 2618 | 1.7746 | 1.0 | 20.7294 | 6.5741 |
| 1.3755 | 35 | 2695 | 1.7601 | 1.0 | 21.0506 | 6.5561 |
| 1.3427 | 36 | 2772 | 1.7552 | 1.0 | 20.5285 | 6.7137 |
| 1.3187 | 37 | 2849 | 1.7441 | 1.0 | 21.9356 | 6.8087 |
| 1.2774 | 38 | 2926 | 1.7366 | 1.0 | 22.3031 | 6.8258 |
| 1.2588 | 39 | 3003 | 1.7361 | 1.0 | 21.2445 | 6.7728 |
| 1.2347 | 40 | 3080 | 1.7379 | 1.0 | 21.5544 | 6.8723 |
| 1.2156 | 41 | 3157 | 1.7426 | 1.0 | 22.1101 | 6.8853 |
| 1.1867 | 42 | 3234 | 1.7350 | 1.0 | 21.7368 | 6.9266 |
| 1.1686 | 43 | 3311 | 1.7336 | 1.0 | 21.1771 | 6.9280 |
| 1.1534 | 44 | 3388 | 1.7421 | 1.0 | 21.3334 | 7.0360 |
| 1.1272 | 45 | 3465 | 1.7443 | 1.0 | 21.3038 | 7.0424 |
| 1.1005 | 46 | 3542 | 1.7434 | 1.0 | 21.3993 | 7.0903 |
| 1.0742 | 47 | 3619 | 1.7236 | 1.0 | 20.4971 | 7.1741 |
| 1.0821 | 48 | 3696 | 1.7441 | 1.0 | 21.8380 | 7.2903 |
| 1.0375 | 49 | 3773 | 1.7433 | 1.0 | 21.4747 | 7.3077 |
| 1.0233 | 50 | 3850 | 1.7477 | 1.0 | 20.9875 | 7.2494 |
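
The BLEU column can be reproduced with the sacrebleu wrapper in the evaluate library. The snippet below is a self-contained sketch with toy French sentences standing in for real outputs; the actual evaluation pair and decoding settings are not given in the card.

```python
import evaluate

bleu = evaluate.load("sacrebleu")

# sacrebleu takes plain-text predictions and one or more references per
# example; these toy sentences are placeholders, not real eval data.
predictions = ["Le livre est sur la table."]
references = [["Le livre est posé sur la table."]]

result = bleu.compute(predictions=predictions, references=references)
print(f"BLEU: {result['score']:.4f}")
```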

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1