BABILOON/results

text2text-generation

Generated from Trainer

	---
	library_name: transformers
	license: apache-2.0
	base_model: facebook/bart-base
	tags:
	- generated_from_trainer
	metrics:
	- bleu
	model-index:
	- name: results
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# results

	This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on an unknown dataset.
	It achieves the following results on the evaluation set after the final epoch (epoch 30):
	- Loss: 2.6147
	- Bleu: 0.2098
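A minimal inference sketch may help readers try the checkpoint. It assumes the weights and tokenizer are published under the repository id `BABILOON/results`; if the tokenizer files are missing there, loading the tokenizer from `facebook/bart-base` is a plausible fallback. The helper name `generate`, the beam width, and the `max_new_tokens` default are illustrative choices, not settings taken from this card.

```python
# Assumption: the fine-tuned checkpoint is hosted under this repository id.
MODEL_ID = "BABILOON/results"


def generate(texts, model_id=MODEL_ID, max_new_tokens=64):
    """Return one generated string per input string (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    model.eval()

    # Pad to the longest input and truncate to the tokenizer's maximum length.
    inputs = tokenizer(list(texts), return_tensors="pt", padding=True, truncation=True)

    # Beam search with 4 beams is an illustrative choice, not a documented setting.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling `generate(["some input text"])` downloads the weights on first use. Because the task and input format are not documented, the input text has to come from whatever data the model was fine-tuned on.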

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0002
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- optimizer: AdamW (`OptimizerNames.ADAMW_TORCH`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
	- lr_scheduler_type: linear
	- num_epochs: 30
	- mixed_precision_training: Native AMP
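The hyperparameters above imply a learning-rate trajectory and a rough training-set size, which the following sketch works out. It assumes zero warmup steps (none are listed), a single device, no gradient accumulation, and a retained final partial batch. The 414 steps per epoch come from the "Step" column of the training results. The function names are hypothetical.

```python
BASE_LR = 2e-4          # learning_rate from the list above
TRAIN_BATCH_SIZE = 16   # train_batch_size from the list above
NUM_EPOCHS = 30         # num_epochs from the list above
STEPS_PER_EPOCH = 414   # from the "Step" column of the training results
WARMUP_STEPS = 0        # assumption: the card lists no warmup

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate after `step` optimizer updates under linear warmup and decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def train_set_size_bounds(steps_per_epoch=STEPS_PER_EPOCH, batch_size=TRAIN_BATCH_SIZE):
    """Smallest and largest training-set sizes consistent with the step count.

    Assumes one device, no gradient accumulation, and a kept final partial batch.
    """
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


print(TOTAL_STEPS)              # 12420
print(linear_lr(0))             # 0.0002
print(train_set_size_bounds())  # (6609, 6624)
```

Under these assumptions the rate falls linearly from 0.0002 to zero over 12,420 updates, and the training set holds between 6,609 and 6,624 examples.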

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Bleu |
	|:-------------:|:-----:|:-----:|:---------------:|:------:|
	| No log | 1.0 | 414 | 1.5123 | 0.1913 |
	| 1.8064 | 2.0 | 828 | 1.5254 | 0.2027 |
	| 1.1389 | 3.0 | 1242 | 1.5991 | 0.1996 |
	| 0.771 | 4.0 | 1656 | 1.7113 | 0.2013 |
	| 0.5395 | 5.0 | 2070 | 1.8532 | 0.2015 |
	| 0.5395 | 6.0 | 2484 | 1.9543 | 0.1955 |
	| 0.3855 | 7.0 | 2898 | 2.0825 | 0.2021 |
	| 0.2794 | 8.0 | 3312 | 2.1461 | 0.1981 |
	| 0.2203 | 9.0 | 3726 | 2.2048 | 0.2049 |
	| 0.1833 | 10.0 | 4140 | 2.2585 | 0.1997 |
	| 0.154 | 11.0 | 4554 | 2.2970 | 0.2018 |
	| 0.154 | 12.0 | 4968 | 2.3328 | 0.2013 |
	| 0.1266 | 13.0 | 5382 | 2.3380 | 0.2012 |
	| 0.1054 | 14.0 | 5796 | 2.4021 | 0.2026 |
	| 0.0905 | 15.0 | 6210 | 2.4106 | 0.2005 |
	| 0.0776 | 16.0 | 6624 | 2.4528 | 0.1988 |
	| 0.0665 | 17.0 | 7038 | 2.4778 | 0.2017 |
	| 0.0665 | 18.0 | 7452 | 2.5210 | 0.2031 |
	| 0.0563 | 19.0 | 7866 | 2.5157 | 0.2029 |
	| 0.0487 | 20.0 | 8280 | 2.5245 | 0.1998 |
	| 0.0411 | 21.0 | 8694 | 2.5513 | 0.2016 |
	| 0.0346 | 22.0 | 9108 | 2.5436 | 0.1994 |
	| 0.0304 | 23.0 | 9522 | 2.5845 | 0.1996 |
	| 0.0304 | 24.0 | 9936 | 2.5827 | 0.2005 |
	| 0.0237 | 25.0 | 10350 | 2.5879 | 0.2067 |
	| 0.0201 | 26.0 | 10764 | 2.5744 | 0.2050 |
	| 0.0177 | 27.0 | 11178 | 2.6031 | 0.2091 |
	| 0.0151 | 28.0 | 11592 | 2.5932 | 0.2099 |
	| 0.0133 | 29.0 | 12006 | 2.6221 | 0.2105 |
	| 0.0133 | 30.0 | 12420 | 2.6147 | 0.2098 |
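The table shows validation loss and BLEU pulling in opposite directions, so the choice of checkpoint depends on which metric is used for selection. The sketch below copies the epoch, validation loss, and BLEU columns from the table and finds the best epoch under each criterion. The variable names are illustrative.

```python
# (epoch, validation_loss, bleu) copied from the training results table.
HISTORY = [
    (1, 1.5123, 0.1913), (2, 1.5254, 0.2027), (3, 1.5991, 0.1996),
    (4, 1.7113, 0.2013), (5, 1.8532, 0.2015), (6, 1.9543, 0.1955),
    (7, 2.0825, 0.2021), (8, 2.1461, 0.1981), (9, 2.2048, 0.2049),
    (10, 2.2585, 0.1997), (11, 2.2970, 0.2018), (12, 2.3328, 0.2013),
    (13, 2.3380, 0.2012), (14, 2.4021, 0.2026), (15, 2.4106, 0.2005),
    (16, 2.4528, 0.1988), (17, 2.4778, 0.2017), (18, 2.5210, 0.2031),
    (19, 2.5157, 0.2029), (20, 2.5245, 0.1998), (21, 2.5513, 0.2016),
    (22, 2.5436, 0.1994), (23, 2.5845, 0.1996), (24, 2.5827, 0.2005),
    (25, 2.5879, 0.2067), (26, 2.5744, 0.2050), (27, 2.6031, 0.2091),
    (28, 2.5932, 0.2099), (29, 2.6221, 0.2105), (30, 2.6147, 0.2098),
]

# Lowest validation loss and highest BLEU pick different checkpoints.
best_by_loss = min(HISTORY, key=lambda row: row[1])
best_by_bleu = max(HISTORY, key=lambda row: row[2])
final = HISTORY[-1]

print(best_by_loss)  # (1, 1.5123, 0.1913)
print(best_by_bleu)  # (29, 2.6221, 0.2105)
print(final)         # (30, 2.6147, 0.2098)
```

Validation loss is lowest after the first epoch and rises from there while training loss falls toward zero, a pattern that usually indicates overfitting. BLEU peaks at epoch 29. The headline metrics on this card match epoch 30, so the published weights appear to be the final checkpoint rather than the best one by either metric. The card does not say whether `load_best_model_at_end` or `metric_for_best_model` was set in the training arguments.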


	### Framework versions

	- Transformers 4.52.2
	- PyTorch 2.6.0+cu124
	- Datasets 3.6.0
	- Tokenizers 0.21.1