---
library_name: transformers
license: cc-by-nc-4.0
base_model: facebook/nllb-200-distilled-600M
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: nllb_complete
  results: []
---

# nllb_complete

This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5537
- BLEU: 54.0987
- Gen Len: 17.1547

## Model description

More information needed

## Intended uses & limitations

More information needed
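
Since usage is not yet documented, here is a minimal inference sketch. The repo id and the FLORES-200 language codes (`eng_Latn`, `fra_Latn`) are placeholder assumptions, not documented properties of this checkpoint:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder repo id; substitute the actual checkpoint path or hub id.
model_name = "nllb_complete"

# src_lang must match the source language used during fine-tuning
# (eng_Latn here is an assumption, not taken from the card).
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("Hello, world!", return_tensors="pt")
generated = model.generate(
    **inputs,
    # NLLB selects the target language via a forced BOS language token.
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```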

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list for how they map onto `Seq2SeqTrainingArguments`):
- learning_rate: 3e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 32 (2 per device × 16 accumulation steps)
- optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 5000
- num_epochs: 24.0
- mixed_precision_training: Native AMP
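
A sketch of how the hyperparameters above translate into a `Seq2SeqTrainingArguments` configuration; `output_dir` and any settings not listed above (e.g. evaluation cadence) are illustrative assumptions, not recorded from the actual run:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb_complete",        # assumed; not taken from the card
    learning_rate=3e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=16,    # 2 × 16 = effective batch size of 32
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_steps=5000,
    num_train_epochs=24.0,
    fp16=True,                         # Native AMP mixed precision
    predict_with_generate=True,        # required for BLEU / Gen Len eval
)
```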

### Training results

| Training Loss | Epoch   | Step   | Validation Loss | BLEU    | Gen Len |
|:-------------:|:-------:|:------:|:---------------:|:-------:|:-------:|
| 0.7527        | 2.0916  | 10000  | 0.7154          | 41.0087 | 17.0123 |
| 0.5274        | 4.1832  | 20000  | 0.5901          | 46.2079 | 17.2597 |
| 0.4209        | 6.2748  | 30000  | 0.5502          | 49.6827 | 17.0787 |
| 0.381         | 8.3665  | 40000  | 0.5324          | 51.0468 | 17.1997 |
| 0.3123        | 10.4581 | 50000  | 0.5264          | 52.239  | 17.0687 |
| 0.279         | 12.5497 | 60000  | 0.5292          | 52.8077 | 17.163  |
| 0.2568        | 14.6413 | 70000  | 0.5320          | 53.2148 | 17.1983 |
| 0.2234        | 16.7329 | 80000  | 0.5415          | 53.2988 | 17.1817 |
| 0.2208        | 18.8245 | 90000  | 0.5455          | 53.9008 | 17.1253 |
| 0.2179        | 20.9162 | 100000 | 0.5501          | 54.2302 | 17.107  |
| 0.2057        | 23.0077 | 110000 | 0.5537          | 54.0987 | 17.1547 |
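
The exact metric hook used for this run is not documented; the BLEU and Gen Len columns are consistent with the standard sacrebleu-based `compute_metrics` commonly used with `Seq2SeqTrainer`, sketched below (the NLLB `tokenizer` is assumed to be in scope):

```python
import evaluate
import numpy as np

sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    # Replace label padding (-100) so the tokenizer can decode references.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # "Gen Len" is the mean number of non-pad tokens in the generations
    # (an assumption about how the column above was computed).
    gen_lens = [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]
    return {"bleu": result["score"], "gen_len": float(np.mean(gen_lens))}
```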

### Framework versions

- Transformers 4.53.3
- PyTorch 2.7.1+cu126
- Datasets 3.6.0
- Tokenizers 0.21.2