kaggle-competitionV4

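This model is a fine-tuned version of facebook/nllb-200-distilled-600M; the training dataset is not specified. At the last logged evaluation (step 3600, epoch 6.5336), it achieves the following results on the evaluation set:

Loss: 1.4491
Score: 33.5067
Bleu: 25.3096
Chrf: 44.3587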

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 4
total_train_batch_size: 128
total_eval_batch_size: 32
optimizer: Adafactor (OptimizerNames.ADAFACTOR), with no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 7
mixed_precision_training: Native AMP
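
The totals in the list above follow from the per-device values, and the epoch/step pairs in the results table allow a rough estimate of the run length. The sketch below reproduces that arithmetic. The steps-per-epoch figure is inferred from the last logged row (step 3600 at epoch 6.5336), not reported by the card. The `linear_lr` helper is hypothetical: it illustrates a linear schedule under the assumption of zero warmup steps, which the card does not state.

```python
# Values copied from the hyperparameter list above.
per_device_train_batch_size = 8
per_device_eval_batch_size = 8
num_devices = 4
gradient_accumulation_steps = 4
num_epochs = 7
learning_rate = 5e-05

# Effective batch sizes: gradient accumulation applies to training only.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
total_eval_batch_size = per_device_eval_batch_size * num_devices

# Inferred from the last logged row of the results table (an estimate).
steps_per_epoch = 3600 / 6.5336
approx_total_steps = round(steps_per_epoch * num_epochs)
approx_train_examples = steps_per_epoch * total_train_batch_size


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Hypothetical helper: linear warmup (optional), then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print("total_train_batch_size:", total_train_batch_size)
print("total_eval_batch_size:", total_eval_batch_size)
print("estimated steps per epoch:", round(steps_per_epoch))
print("estimated total optimizer steps:", approx_total_steps)
print("estimated training examples:", round(approx_train_examples))
print("estimated lr at step 3600:", linear_lr(3600, approx_total_steps, learning_rate))
```

Under these assumptions, the learning rate at the last logged evaluation is already close to zero, which would be consistent with the small changes in validation loss over the final rows of the table.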

Training Loss	Epoch	Step	Validation Loss	Score	Bleu	Chrf
No log	0.5445	300	2.1560	19.4321	12.1497	31.0796
2.5006	1.0889	600	1.8783	24.8743	17.1755	36.0241
2.5006	1.6334	900	1.7428	27.3208	19.4256	38.4248
1.8822	2.1779	1200	1.6604	29.1509	21.0993	40.2751
1.6935	2.7223	1500	1.5974	29.3307	20.8970	41.1681
1.6935	3.2668	1800	1.5557	31.1230	22.9892	42.1347
1.5797	3.8113	2100	1.5209	30.8637	22.3684	42.5854
1.5797	4.3557	2400	1.4980	32.1591	23.8574	43.3495
1.5044	4.9002	2700	1.4774	32.6654	24.3790	43.7686
1.4484	5.4446	3000	1.4651	33.2768	25.0795	44.1535
1.4484	5.9891	3300	1.4541	32.7798	24.3589	44.1118
1.4135	6.5336	3600	1.4491	33.5067	25.3096	44.3587
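
The card does not define the Score column. The logged values are numerically consistent with the geometric mean of the Bleu and Chrf columns, i.e. sqrt(Bleu * Chrf). That reading is an inference from the numbers, not something the card states. The sketch below checks it against every row of the table and picks out the step with the highest Score.

```python
import math

# (step, score, bleu, chrf) copied from the results table above.
rows = [
    (300, 19.4321, 12.1497, 31.0796),
    (600, 24.8743, 17.1755, 36.0241),
    (900, 27.3208, 19.4256, 38.4248),
    (1200, 29.1509, 21.0993, 40.2751),
    (1500, 29.3307, 20.8970, 41.1681),
    (1800, 31.1230, 22.9892, 42.1347),
    (2100, 30.8637, 22.3684, 42.5854),
    (2400, 32.1591, 23.8574, 43.3495),
    (2700, 32.6654, 24.3790, 43.7686),
    (3000, 33.2768, 25.0795, 44.1535),
    (3300, 32.7798, 24.3589, 44.1118),
    (3600, 33.5067, 25.3096, 44.3587),
]


def geometric_mean(bleu, chrf):
    """Geometric mean of two non-negative metric values."""
    return math.sqrt(bleu * chrf)


# Largest gap between the recomputed value and the logged Score.
# Small gaps are expected because the table values are rounded to 4 decimals.
max_abs_diff = max(abs(geometric_mean(b, c) - s) for _, s, b, c in rows)
print(f"largest |sqrt(Bleu * Chrf) - Score| = {max_abs_diff:.6f}")

# Step with the highest logged Score.
best_step = max(rows, key=lambda r: r[1])[0]
print("best step by Score:", best_step)
```

Score, Bleu and Chrf do not improve monotonically (for example, Score dips at steps 2100 and 3300), so selecting a checkpoint by Score rather than by validation loss can give a different choice at intermediate points in training.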

Model size: 0.6B params (Safetensors)

Tensor type: F32
