nllb-dry-run

This model is a fine-tuned version of facebook/nllb-200-distilled-600M; the training dataset is not specified. At step 50, the last logged evaluation, it achieves the following results on the evaluation set:

Loss: 3.5119
Bleu: 3.4640
Chrf: 18.8077

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 8e-05
train_batch_size: 2
eval_batch_size: 4
seed: 42
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
training_steps: 200
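
To make the linear schedule concrete, here is a minimal sketch of the learning rate at a given step. It assumes zero warmup steps (the card lists none) and decay to zero at step 200. `linear_lr` is a hypothetical helper written for illustration, not a Transformers function.

```python
# Hyperparameters copied from the list above.
PEAK_LR = 8e-05
TRAINING_STEPS = 200
WARMUP_STEPS = 0  # assumption: the card does not list any warmup


def linear_lr(step, peak_lr=PEAK_LR, total_steps=TRAINING_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 10, 50, 100, 200):
        print(f"step {step:3d}: lr = {linear_lr(step):.2e}")
```

Under these assumptions, the rate at step 50 would be about 6e-05, three quarters of the peak value.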

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Chrf
No log	0.3125	10	3.4697	2.3857	15.3837
No log	0.625	20	3.3382	3.7821	19.2101
No log	0.9375	30	3.3116	5.8430	18.9142
No log	1.25	40	3.2531	3.6516	19.1025
No log	1.5625	50	3.5119	3.4640	18.8077
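
The table can be read a little further with some arithmetic. The sketch below derives the number of optimizer steps per epoch from the Epoch and Step columns and finds the step with the best value for each metric. The rows are copied from the table; the examples-per-epoch estimate assumes a single device and no gradient accumulation, neither of which the card states. `steps_per_epoch` and `best_step` are hypothetical helper names.

```python
# Rows copied from the table above: (epoch, step, validation loss, BLEU, chrF).
ROWS = [
    (0.3125, 10, 3.4697, 2.3857, 15.3837),
    (0.625, 20, 3.3382, 3.7821, 19.2101),
    (0.9375, 30, 3.3116, 5.8430, 18.9142),
    (1.25, 40, 3.2531, 3.6516, 19.1025),
    (1.5625, 50, 3.5119, 3.4640, 18.8077),
]
TRAIN_BATCH_SIZE = 2   # from the hyperparameter list
TRAINING_STEPS = 200   # from the hyperparameter list


def steps_per_epoch(rows):
    """Infer optimizer steps per epoch from the (epoch, step) pairs."""
    ratios = {round(step / epoch) for epoch, step, *_ in rows}
    if len(ratios) != 1:
        raise ValueError(f"inconsistent epoch/step ratios: {ratios}")
    return ratios.pop()


def best_step(rows, column, higher_is_better):
    """Return the step whose value in `column` is best."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])[1]


if __name__ == "__main__":
    per_epoch = steps_per_epoch(ROWS)
    print("steps per epoch:", per_epoch)
    # Assumption: one device, no gradient accumulation.
    print("approx. training examples:", per_epoch * TRAIN_BATCH_SIZE)
    print("epochs covered by 200 steps:", TRAINING_STEPS / per_epoch)
    print("lowest validation loss at step", best_step(ROWS, 2, False))
    print("highest BLEU at step", best_step(ROWS, 3, True))
    print("highest chrF at step", best_step(ROWS, 4, True))
```

On these rows, the lowest validation loss, highest BLEU, and highest chrF fall at steps 40, 30, and 20 respectively, so the step-50 figures reported at the top of the card are not the best logged values for any of the three metrics.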

Framework versions

Transformers 5.5.4
PyTorch 2.8.0+cu128
Datasets 4.8.4
Tokenizers 0.22.2

Model size: 0.6B params
Tensor type: F32
Format: Safetensors

Model tree for madoss/nllb-dry-run

Base model: facebook/nllb-200-distilled-600M
Finetuned: this model