c5b12a681d1784a5d2bd19f15b6fa055

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2239
  • Data Size: 1.0
  • Epoch Runtime: 551.7313
  • Bleu: 13.8107

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
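The effective batch size follows directly from these settings: 4 devices with a per-device batch of 8 give 32 sequences per optimizer step. If the roughly 2,336 steps per epoch visible in the results table below correspond to one pass over the full training split, that implies about 74.7k training pairs. A minimal sketch of the arithmetic (the dataset size is inferred from the step counts, not stated in this card):

```python
# Back-of-the-envelope check of the hyperparameters above.
# Assumption: the ~2,336 steps per epoch in the results table
# correspond to one pass over the full training split.

per_device_batch = 8   # train_batch_size
num_devices = 4        # multi-GPU, num_devices

# Effective batch size per optimizer step (matches total_train_batch_size).
total_train_batch_size = per_device_batch * num_devices

steps_per_epoch = 2336  # read off the Step column of the results table
approx_train_examples = steps_per_epoch * total_train_batch_size

print(total_train_batch_size)  # 32
print(approx_train_examples)   # 74752
```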

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:-------:|
| No log | 0 | 0 | 2.9614 | 0 | 43.3896 | 1.2616 |
| No log | 1 | 2336 | 2.6315 | 0.0078 | 49.1768 | 2.5799 |
| 0.0426 | 2 | 4672 | 2.4835 | 0.0156 | 55.1410 | 3.3543 |
| 0.0605 | 3 | 7008 | 2.3631 | 0.0312 | 59.2835 | 4.1974 |
| 2.517 | 4 | 9344 | 2.2438 | 0.0625 | 77.6258 | 5.2345 |
| 2.3771 | 5 | 11680 | 2.1099 | 0.125 | 109.8965 | 6.5155 |
| 2.1849 | 6 | 14016 | 1.9473 | 0.25 | 168.9280 | 8.3202 |
| 2.0232 | 7 | 16352 | 1.7793 | 0.5 | 305.9969 | 10.0454 |
| 1.8339 | 8.0 | 18688 | 1.6112 | 1.0 | 556.5045 | 11.5823 |
| 1.6857 | 9.0 | 21024 | 1.5153 | 1.0 | 654.5415 | 12.1292 |
| 1.5826 | 10.0 | 23360 | 1.4566 | 1.0 | 607.7700 | 12.7234 |
| 1.5364 | 11.0 | 25696 | 1.4112 | 1.0 | 604.4563 | 13.2133 |
| 1.47 | 12.0 | 28032 | 1.3787 | 1.0 | 569.7755 | 13.1228 |
| 1.4091 | 13.0 | 30368 | 1.3494 | 1.0 | 580.4711 | 13.3372 |
| 1.3823 | 14.0 | 32704 | 1.3264 | 1.0 | 584.1167 | 13.7721 |
| 1.3276 | 15.0 | 35040 | 1.3081 | 1.0 | 574.8867 | 13.9498 |
| 1.345 | 16.0 | 37376 | 1.2958 | 1.0 | 628.8890 | 13.4865 |
| 1.2603 | 17.0 | 39712 | 1.2765 | 1.0 | 583.2858 | 13.7269 |
| 1.2638 | 18.0 | 42048 | 1.2682 | 1.0 | 623.2504 | 13.8755 |
| 1.2323 | 19.0 | 44384 | 1.2627 | 1.0 | 562.4616 | 13.7426 |
| 1.1931 | 20.0 | 46720 | 1.2478 | 1.0 | 590.6083 | 13.8796 |
| 1.1645 | 21.0 | 49056 | 1.2396 | 1.0 | 578.0513 | 13.6221 |
| 1.1487 | 22.0 | 51392 | 1.2348 | 1.0 | 578.9635 | 13.7758 |
| 1.1352 | 23.0 | 53728 | 1.2331 | 1.0 | 595.5290 | 13.5993 |
| 1.0869 | 24.0 | 56064 | 1.2301 | 1.0 | 580.7034 | 13.9527 |
| 1.0943 | 25.0 | 58400 | 1.2242 | 1.0 | 557.8745 | 13.9567 |
| 1.0709 | 26.0 | 60736 | 1.2177 | 1.0 | 607.4727 | 14.3138 |
| 1.0636 | 27.0 | 63072 | 1.2183 | 1.0 | 577.9959 | 13.7990 |
| 1.0399 | 28.0 | 65408 | 1.2145 | 1.0 | 616.5945 | 14.1473 |
| 1.0205 | 29.0 | 67744 | 1.2165 | 1.0 | 631.6346 | 13.9411 |
| 1.0111 | 30.0 | 70080 | 1.2168 | 1.0 | 555.7352 | 13.7206 |
| 0.9924 | 31.0 | 72416 | 1.2138 | 1.0 | 584.9051 | 13.9443 |
| 0.9949 | 32.0 | 74752 | 1.2126 | 1.0 | 662.6398 | 13.8611 |
| 0.9613 | 33.0 | 77088 | 1.2123 | 1.0 | 591.4603 | 13.8049 |
| 0.9642 | 34.0 | 79424 | 1.2153 | 1.0 | 665.3925 | 13.8260 |
| 0.9437 | 35.0 | 81760 | 1.2113 | 1.0 | 565.2026 | 13.8388 |
| 0.9413 | 36.0 | 84096 | 1.2106 | 1.0 | 537.2378 | 13.7324 |
| 0.9205 | 37.0 | 86432 | 1.2100 | 1.0 | 580.5891 | 13.7002 |
| 0.9219 | 38.0 | 88768 | 1.2152 | 1.0 | 561.8277 | 13.9455 |
| 0.9163 | 39.0 | 91104 | 1.2191 | 1.0 | 528.5714 | 13.9284 |
| 0.8856 | 40.0 | 93440 | 1.2203 | 1.0 | 562.3911 | 13.6180 |
| 0.8833 | 41.0 | 95776 | 1.2239 | 1.0 | 551.7313 | 13.8107 |
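The Data Size column in the table above follows a doubling curriculum: epoch 1 trains on roughly 1/128 of the data and the fraction doubles each epoch until the full set is reached at epoch 8. The card does not state the schedule explicitly; this sketch simply reproduces the fractions that appear in the table (epoch 0 is treated as an initial evaluation pass with no training data):

```python
# Data-size schedule read off the results table (an assumption,
# not an implementation detail confirmed by the card):
# fraction = min(1, 2**(epoch - 8)), with epoch 0 using no data.

def data_fraction(epoch: int) -> float:
    if epoch == 0:
        return 0.0
    return min(1.0, 2.0 ** (epoch - 8))

schedule = [round(data_fraction(e), 4) for e in range(9)]
print(schedule)  # [0.0, 0.0078, 0.0156, 0.0312, 0.0625, 0.125, 0.25, 0.5, 1.0]
```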

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1