	---
	license: mit
	library_name: peft
	tags:
	- generated_from_trainer
	base_model: microsoft/phi-2
	model-index:
	- name: out
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# out

	This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 1.9839
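
The reported loss can be put on a more intuitive scale by converting it to perplexity. The sketch below assumes the evaluation loss is a mean per-token cross-entropy in nats; that is the usual convention for causal language model fine-tuning, but this card does not state it.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 1.9839

# Assumption: the loss is mean cross-entropy per token, measured in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")  # about 7.27 under the assumption above
```

If the loss were averaged differently (for example per sequence), this conversion would not apply.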

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 2
	- eval_batch_size: 16
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: constant
	- num_epochs: 20
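
For anyone trying to reproduce the run, the values above map onto `transformers.TrainingArguments` roughly as follows. This is a sketch, not the original training script: `output_dir="out"` is inferred from the model name, and reading `train_batch_size` as a per-device value with no gradient accumulation is an assumption.

```python
# Hyperparameters from this card, expressed as TrainingArguments keyword arguments.
training_kwargs = {
    "output_dir": "out",               # assumption: inferred from the model name
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 2,  # assumption: card value is per device
    "per_device_eval_batch_size": 16,  # assumption: card value is per device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 20,
}


def make_training_arguments():
    """Build a TrainingArguments object; requires `transformers` to be installed."""
    from transformers import TrainingArguments  # imported lazily on purpose

    return TrainingArguments(**training_kwargs)


if __name__ == "__main__":
    for name, value in training_kwargs.items():
        print(f"{name} = {value}")
```

The LoRA/PEFT adapter configuration is not recorded in this card, so it is deliberately left out of the sketch.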

	### Training results

	| Training Loss | Epoch   | Step | Validation Loss |
	|:-------------:|:-------:|:----:|:---------------:|
	| 2.5886        | 0.4889  | 22   | 2.6622          |
	| 2.6172        | 0.9778  | 44   | 2.5162          |
	| 2.502         | 1.4667  | 66   | 2.3493          |
	| 2.031         | 1.9556  | 88   | 2.2040          |
	| 2.1055        | 2.4444  | 110  | 2.0898          |
	| 2.0684        | 2.9333  | 132  | 2.0072          |
	| 1.7919        | 3.4222  | 154  | 1.9535          |
	| 1.608         | 3.9111  | 176  | 1.9187          |
	| 1.7458        | 4.4     | 198  | 1.9064          |
	| 1.7142        | 4.8889  | 220  | 1.8939          |
	| 1.7855        | 5.3778  | 242  | 1.8945          |
	| 1.6425        | 5.8667  | 264  | 1.8936          |
	| 1.6711        | 6.3556  | 286  | 1.8951          |
	| 1.6882        | 6.8444  | 308  | 1.9007          |
	| 1.4663        | 7.3333  | 330  | 1.9092          |
	| 1.6227        | 7.8222  | 352  | 1.9037          |
	| 1.4768        | 8.3111  | 374  | 1.9042          |
	| 1.5643        | 8.8     | 396  | 1.9142          |
	| 1.4109        | 9.2889  | 418  | 1.9128          |
	| 1.5431        | 9.7778  | 440  | 1.9283          |
	| 1.5034        | 10.2667 | 462  | 1.9184          |
	| 1.3418        | 10.7556 | 484  | 1.9188          |
	| 1.5773        | 11.2444 | 506  | 1.9224          |
	| 1.4452        | 11.7333 | 528  | 1.9315          |
	| 1.3154        | 12.2222 | 550  | 1.9313          |
	| 1.3509        | 12.7111 | 572  | 1.9380          |
	| 1.3372        | 13.2    | 594  | 1.9430          |
	| 1.3439        | 13.6889 | 616  | 1.9436          |
	| 1.2385        | 14.1778 | 638  | 1.9430          |
	| 1.2669        | 14.6667 | 660  | 1.9453          |
	| 1.2923        | 15.1556 | 682  | 1.9504          |
	| 1.1558        | 15.6444 | 704  | 1.9531          |
	| 1.3123        | 16.1333 | 726  | 1.9582          |
	| 1.2309        | 16.6222 | 748  | 1.9588          |
	| 1.1934        | 17.1111 | 770  | 1.9617          |
	| 1.1893        | 17.6    | 792  | 1.9639          |
	| 1.1561        | 18.0889 | 814  | 1.9662          |
	| 1.244         | 18.5778 | 836  | 1.9749          |
	| 1.1299        | 19.0667 | 858  | 1.9799          |
	| 1.1213        | 19.5556 | 880  | 1.9839          |
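
The validation column is easier to read with a small script that locates the lowest validation loss and compares it with the final one. The sketch below copies the validation losses from the table; the variable names are illustrative, and steps per epoch is estimated from the first row (step 22 at epoch 0.4889).

```python
# Validation losses from the table, one per evaluation (every 22 steps).
val_losses = [
    2.6622, 2.5162, 2.3493, 2.2040, 2.0898, 2.0072, 1.9535, 1.9187,
    1.9064, 1.8939, 1.8945, 1.8936, 1.8951, 1.9007, 1.9092, 1.9037,
    1.9042, 1.9142, 1.9128, 1.9283, 1.9184, 1.9188, 1.9224, 1.9315,
    1.9313, 1.9380, 1.9430, 1.9436, 1.9430, 1.9453, 1.9504, 1.9531,
    1.9582, 1.9588, 1.9617, 1.9639, 1.9662, 1.9749, 1.9799, 1.9839,
]
eval_every = 22
steps = [eval_every * (i + 1) for i in range(len(val_losses))]

# Estimate steps per epoch from the first table row.
steps_per_epoch = round(22 / 0.4889)

# Find the evaluation point with the lowest validation loss.
best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
best_step = steps[best_index]
best_loss = val_losses[best_index]
best_epoch = best_step / steps_per_epoch

final_loss = val_losses[-1]
gap = final_loss - best_loss

print(f"best validation loss {best_loss:.4f} at step {best_step} (epoch ~{best_epoch:.2f})")
print(f"final validation loss {final_loss:.4f}, {gap:.4f} above the best")
```

In this run the lowest validation loss (1.8936) occurs at step 264, around epoch 5.87, while training loss keeps falling afterwards. That pattern suggests an earlier checkpoint may generalize better than the final one, although the card does not say which checkpoint was saved.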


	### Framework versions

	- PEFT 0.11.1
	- Transformers 4.41.0
	- Pytorch 2.0.1+cu118
	- Datasets 2.19.1
	- Tokenizers 0.19.1
