46ba9f00953106724feeb0e6d75003ad

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books [es-it] dataset. It achieves the following results on the evaluation set (values from the final logged epoch, 46):

Loss: 1.7789
Data Size: 1.0
Epoch Runtime: 175.6510
Bleu: 5.1461

Model description

More information needed

Intended uses & limitations

More information needed
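
Pending a fuller description, a minimal inference sketch might look like the following. It assumes the checkpoint is published under the repository id contemmcm/46ba9f00953106724feeb0e6d75003ad and loads with the standard Transformers seq2seq classes. The task prefix "translate Spanish to Italian: " is an assumption, since the card does not state which prompt was used during fine-tuning, and build_prompt and translate are hypothetical helper names.

```python
# Repository id as shown on the model page.
MODEL_ID = "contemmcm/46ba9f00953106724feeb0e6d75003ad"

# Assumption: a T5-style task prefix; the card does not document the one used in training.
TASK_PREFIX = "translate Spanish to Italian: "


def build_prompt(text: str, prefix: str = TASK_PREFIX) -> str:
    """Prepend the (assumed) task prefix to a Spanish source sentence."""
    return prefix + text.strip()


def translate(text: str, max_new_tokens: int = 128) -> str:
    """Translate one sentence; downloads the checkpoint on first use."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling translate("El libro está sobre la mesa.") would then return the model's Italian output. Given the BLEU of about 5 reported on the evaluation set, output quality is likely to be limited.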

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
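
The listed values map onto Transformers training arguments roughly as sketched below. This is a reconstruction under assumptions, not the author's script: TRAINING_KWARGS and build_training_args are hypothetical names, the output_dir default is a placeholder, and gradient accumulation is assumed to be 1 because the card does not list it. The total batch sizes follow from the per-device size multiplied by the number of devices.

```python
# Hyperparameters copied from the list above, using TrainingArguments field names.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}

NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 1  # assumption: not stated in the card

# Effective batch sizes across all devices.
total_train_batch_size = (
    TRAINING_KWARGS["per_device_train_batch_size"] * NUM_DEVICES * GRAD_ACCUM_STEPS
)
total_eval_batch_size = TRAINING_KWARGS["per_device_eval_batch_size"] * NUM_DEVICES


def build_training_args(output_dir: str = "t5-base-opus-books-es-it"):
    """Build Seq2SeqTrainingArguments (requires transformers and torch)."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


print(total_train_batch_size, total_eval_batch_size)  # 32 32
```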

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	4.6879	0	12.8492	0.5889
No log	1	721	3.8721	0.0078	15.4155	0.7459
No log	2	1442	3.2753	0.0156	16.1442	0.9550
0.0694	3	2163	3.1165	0.0312	18.3636	1.1170
0.2336	4	2884	2.9709	0.0625	24.2604	1.4463
3.1802	5	3605	2.8373	0.125	31.8844	1.7175
3.0032	6	4326	2.6923	0.25	51.6836	2.2262
2.8229	7	5047	2.5315	0.5	92.9028	2.6556
2.5996	8.0	5768	2.3603	1.0	172.6614	3.3808
2.4543	9.0	6489	2.2549	1.0	165.8746	3.8830
2.3677	10.0	7210	2.1796	1.0	168.3646	3.8699
2.2622	11.0	7931	2.1211	1.0	184.5834	4.3776
2.214	12.0	8652	2.0788	1.0	168.4283	4.1449
2.1653	13.0	9373	2.0442	1.0	167.0960	4.3316
2.086	14.0	10094	2.0048	1.0	165.6252	4.5767
2.0697	15.0	10815	1.9798	1.0	169.2267	4.6422
2.0029	16.0	11536	1.9549	1.0	170.1912	4.5726
1.9743	17.0	12257	1.9370	1.0	182.2315	4.7200
1.9525	18.0	12978	1.9166	1.0	171.6727	4.7279
1.8976	19.0	13699	1.9032	1.0	171.3381	4.7782
1.889	20.0	14420	1.8803	1.0	167.8934	4.7469
1.8618	21.0	15141	1.8683	1.0	167.1339	4.7453
1.8258	22.0	15862	1.8562	1.0	173.4089	4.9908
1.7605	23.0	16583	1.8501	1.0	175.8084	4.8125
1.7613	24.0	17304	1.8394	1.0	178.0341	4.7958
1.7575	25.0	18025	1.8367	1.0	169.0567	4.8320
1.7359	26.0	18746	1.8200	1.0	168.6450	5.0095
1.7003	27.0	19467	1.8149	1.0	165.6981	5.0177
1.6815	28.0	20188	1.8135	1.0	179.2173	5.0638
1.6487	29.0	20909	1.8001	1.0	167.8968	4.9766
1.6336	30.0	21630	1.8028	1.0	186.7506	5.0590
1.6327	31.0	22351	1.7956	1.0	168.0909	5.0310
1.6095	32.0	23072	1.7902	1.0	168.5011	5.1742
1.5885	33.0	23793	1.7922	1.0	168.4223	5.0266
1.5486	34.0	24514	1.7871	1.0	164.3427	5.0639
1.5367	35.0	25235	1.7886	1.0	168.8119	5.0905
1.5183	36.0	25956	1.7823	1.0	165.7999	5.1489
1.4925	37.0	26677	1.7807	1.0	168.0827	5.2182
1.4852	38.0	27398	1.7770	1.0	169.7657	5.2135
1.4448	39.0	28119	1.7795	1.0	161.5056	5.1849
1.4603	40.0	28840	1.7791	1.0	178.0606	5.1406
1.4539	41.0	29561	1.7819	1.0	180.8409	5.0923
1.4387	42.0	30282	1.7756	1.0	194.7915	5.1573
1.3895	43.0	31003	1.7854	1.0	174.9050	5.0689
1.4083	44.0	31724	1.7847	1.0	170.9028	5.1679
1.3901	45.0	32445	1.7798	1.0	163.4660	5.1893
1.3658	46.0	33166	1.7789	1.0	175.6510	5.1461
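
The headline evaluation numbers match the last row of the table (epoch 46), which is not the best row on either metric. A small sketch over the final rows makes this easy to check. The tuples are copied from the table; every epoch before 36 has a higher validation loss and a lower BLEU than the best values found here. The variable names are illustrative.

```python
# (epoch, validation_loss, bleu) for epochs 36-46, copied from the table above.
ROWS = [
    (36, 1.7823, 5.1489),
    (37, 1.7807, 5.2182),
    (38, 1.7770, 5.2135),
    (39, 1.7795, 5.1849),
    (40, 1.7791, 5.1406),
    (41, 1.7819, 5.0923),
    (42, 1.7756, 5.1573),
    (43, 1.7854, 5.0689),
    (44, 1.7847, 5.1679),
    (45, 1.7798, 5.1893),
    (46, 1.7789, 5.1461),
]

# Lowest validation loss and highest BLEU, each with its epoch.
best_loss_row = min(ROWS, key=lambda row: row[1])
best_bleu_row = max(ROWS, key=lambda row: row[2])
final_row = ROWS[-1]

print("lowest validation loss:", best_loss_row)  # (42, 1.7756, 5.1573)
print("highest BLEU:", best_bleu_row)            # (37, 1.7807, 5.2182)
print("final logged epoch:", final_row)          # (46, 1.7789, 5.1461)
```

Validation loss is essentially flat from about epoch 36 onward while training loss keeps falling, so the choice of checkpoint among these epochs changes BLEU by less than 0.2.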

Framework versions

Transformers 4.57.0
PyTorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1

Safetensors

Model size: 0.3B params
Tensor type: F32
