facebook_fr

This model is a fine-tuned version of facebook/nllb-200-distilled-600M. The card does not name the training dataset (it is recorded as "None"). Evaluation results for each logged checkpoint are listed in the training results table below.
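A minimal inference sketch, assuming the checkpoint loads like its base model through the standard transformers seq2seq classes. The card does not give the full repository path or the language pair, so `MODEL_ID` below points at the base model as a stand-in, and the English-to-French direction (`eng_Latn` to `fra_Latn`) is a guess from the model name rather than something the card confirms.

```python
# Stand-in: replace with the fine-tuned checkpoint's repository path (not stated in the card).
MODEL_ID = "facebook/nllb-200-distilled-600M"
# Assumed language pair, written as NLLB language codes; the card does not confirm it.
SRC_LANG = "eng_Latn"
TGT_LANG = "fra_Latn"


def translate(text: str, max_new_tokens: int = 64) -> str:
    """Translate one sentence with an NLLB-style seq2seq checkpoint."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, src_lang=SRC_LANG)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(text, return_tensors="pt")
    # NLLB selects the output language by forcing its language code as the first token.
    generated = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

Calling `translate("Hello, how are you?")` downloads the weights on first use. The default `max_new_tokens` of 64 is an arbitrary choice that leaves headroom over the generation lengths of roughly 37 tokens reported in the results table.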

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 50
mixed_precision_training: Native AMP
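The listed values can be sketched as keyword arguments for transformers' `TrainingArguments` (or `Seq2SeqTrainingArguments`). The mapping from card labels to argument names is an assumption, as are the single-device setup and the reading of "Native AMP" as `fp16=True`. The helper `effective_batch_size` is hypothetical and only reproduces the arithmetic behind total_train_batch_size.

```python
# Card hyperparameters keyed by TrainingArguments field names (mapping assumed).
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 50,
    "fp16": True,  # assumed reading of "Native AMP"; needs a GPU at training time
}


def effective_batch_size(kwargs: dict, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (hypothetical helper)."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


print(effective_batch_size(training_kwargs))  # 32, matching total_train_batch_size
```

With transformers installed, the dictionary could be passed as `Seq2SeqTrainingArguments(output_dir="out", **training_kwargs)`, where `output_dir` is a placeholder path.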

Training Loss	Epoch	Step	Validation Loss	Bleu	Rouge	Meteor	Gen Len
1.5262	1.3664	500	1.2745	20.5516	0.3884	0.3845	37.5273
1.2177	2.7327	1000	1.0978	26.2735	0.4465	0.4438	36.7556
0.9145	4.0984	1500	1.0268	28.8139	0.4707	0.4652	36.4589
0.8966	5.4648	2000	0.9816	30.3937	0.4879	0.4863	36.7879
0.8245	6.8312	2500	0.9547	31.432	0.4974	0.4931	36.741
0.7138	8.1969	3000	0.9424	32.6471	0.5076	0.5049	36.4266
0.7898	9.5632	3500	0.9284	33.0169	0.5132	0.5093	36.5511
0.7131	10.9296	4000	0.9191	33.5642	0.5204	0.5184	37.1107
0.6588	12.2953	4500	0.9207	34.1405	0.5243	0.5201	36.7248
0.6315	13.6617	5000	0.9167	34.4187	0.5233	0.5215	37.0354
0.5507	15.0273	5500	0.9201	34.658	0.5292	0.5276	36.7955
0.6227	16.3937	6000	0.9253	34.5595	0.5301	0.5316	36.9869
0.525	17.7601	6500	0.9235	34.8124	0.5362	0.5353	36.8709
0.4712	19.1258	7000	0.9339	35.4803	0.5383	0.5405	36.9185
0.5144	20.4921	7500	0.9397	35.3344	0.536	0.5385	36.9201
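The table can be read programmatically to see where each metric peaks. The sketch below copies the step, epoch, validation loss and BLEU columns from the rows above and picks the best checkpoint under each criterion. It also estimates the number of optimizer steps per epoch from the last logged row. The variable names are illustrative only.

```python
# (step, epoch, validation_loss, bleu), copied from the results table.
rows = [
    (500, 1.3664, 1.2745, 20.5516),
    (1000, 2.7327, 1.0978, 26.2735),
    (1500, 4.0984, 1.0268, 28.8139),
    (2000, 5.4648, 0.9816, 30.3937),
    (2500, 6.8312, 0.9547, 31.432),
    (3000, 8.1969, 0.9424, 32.6471),
    (3500, 9.5632, 0.9284, 33.0169),
    (4000, 10.9296, 0.9191, 33.5642),
    (4500, 12.2953, 0.9207, 34.1405),
    (5000, 13.6617, 0.9167, 34.4187),
    (5500, 15.0273, 0.9201, 34.658),
    (6000, 16.3937, 0.9253, 34.5595),
    (6500, 17.7601, 0.9235, 34.8124),
    (7000, 19.1258, 0.9339, 35.4803),
    (7500, 20.4921, 0.9397, 35.3344),
]

best_loss = min(rows, key=lambda r: r[2])  # lowest validation loss
best_bleu = max(rows, key=lambda r: r[3])  # highest BLEU

# Rough optimizer steps per epoch, from the last logged row.
last_step, last_epoch = rows[-1][0], rows[-1][1]
steps_per_epoch = round(last_step / last_epoch)

print(best_loss[0], best_loss[2])  # 5000 0.9167
print(best_bleu[0], best_bleu[3])  # 7000 35.4803
print(steps_per_epoch)             # 366
```

Validation loss reaches its minimum at step 5000, while BLEU keeps improving until step 7000, so the preferred checkpoint depends on which metric matters for the use case.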

Model size: 0.6B params
Tensor type: F32 (Safetensors)
