nlile/PE-13b-lora


	---
	base_model: stabilityai/StableBeluga-13B
	tags:
	- generated_from_trainer
	model-index:
	- name: PE-13b-lora
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# PE-13b-lora

	This model is a fine-tuned version of [stabilityai/StableBeluga-13B](https://huggingface.co/stabilityai/StableBeluga-13B) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.5704
	- Rewards/chosen: 0.1581
	- Rewards/rejected: -0.1076
	- Rewards/accuracies: 0.9472
	- Rewards/margins: 0.2658
	- Logps/rejected: -73.1769
	- Logps/chosen: -90.4042
	- Logits/rejected: -1.7758
	- Logits/chosen: -2.0462
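
The metric names above (`Rewards/chosen`, `Rewards/rejected`, `Rewards/margins`) match those logged by preference-optimization trainers such as TRL's `DPOTrainer`. The card does not say which trainer was used, so this is an assumption. Under it, the margin is the chosen reward minus the rejected reward, and the sigmoid-type loss for one preference pair is `-log(sigmoid(margin))`. A minimal sketch that checks the reported numbers against these relations:

```python
import math

# Final evaluation metrics copied from the list above.
rewards_chosen = 0.1581
rewards_rejected = -0.1076
reported_margin = 0.2658
reported_loss = 0.5704


def sigmoid_preference_loss(margin: float) -> float:
    """-log(sigmoid(margin)): per-pair sigmoid DPO loss (assumed loss type)."""
    return math.log1p(math.exp(-margin))


# Margin is the chosen reward minus the rejected reward.
margin = rewards_chosen - rewards_rejected

# With zero margin the loss equals ln(2), roughly 0.693.
loss_at_zero_margin = sigmoid_preference_loss(0.0)

# Loss evaluated at the mean margin. The reported loss is a mean over pairs,
# and the loss is convex in the margin, so by Jensen's inequality the
# reported value should be equal to or higher than this one.
loss_at_mean_margin = sigmoid_preference_loss(margin)

print(f"margin: {margin:.4f} (reported {reported_margin})")
print(f"loss at zero margin: {loss_at_zero_margin:.4f}")
print(f"loss at mean margin: {loss_at_mean_margin:.4f} (reported {reported_loss})")
```

If the assumption holds, the recomputed margin should agree with the reported one up to rounding, and the loss at the mean margin should not exceed the reported loss.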

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-07
	- train_batch_size: 6
	- eval_batch_size: 4
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 8
	- gradient_accumulation_steps: 2
	- total_train_batch_size: 96
	- total_eval_batch_size: 32
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 1
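
The two batch totals follow from the per-device sizes, the device count, and gradient accumulation. The sketch below reproduces that arithmetic and illustrates a linear schedule with a 0.1 warmup ratio. The function name `linear_lr` and the value `total_steps = 1000` are illustrative assumptions, since the card does not list the total number of optimizer steps. The schedule shape (linear warmup to the peak rate, then linear decay to zero) is assumed to follow the usual Transformers behaviour for `lr_scheduler_type: linear`.

```python
import math

# Values copied from the hyperparameter list above.
learning_rate = 5e-07
train_batch_size = 6   # per device
eval_batch_size = 4    # per device
num_devices = 8
gradient_accumulation_steps = 2
warmup_ratio = 0.1

# Effective batch sizes across all devices.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices


def linear_lr(step: int, total_steps: int) -> float:
    """Linear warmup to the peak rate, then linear decay to zero (assumed shape)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return learning_rate * max(0.0, remaining)


print("total_train_batch_size:", total_train_batch_size)
print("total_eval_batch_size:", total_eval_batch_size)

# total_steps is a hypothetical value chosen only for illustration.
total_steps = 1000
for step in (0, 50, 100, 550, 1000):
    print(step, linear_lr(step, total_steps))
```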

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
	|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
	| 0.693 | 0.07 | 100 | 0.6933 | -0.0008 | -0.0005 | 0.4889 | -0.0003 | -72.1053 | -91.9932 | -1.7861 | -2.0525 |
	| 0.69 | 0.14 | 200 | 0.6901 | 0.0031 | -0.0015 | 0.5611 | 0.0046 | -72.1153 | -91.9544 | -1.7859 | -2.0524 |
	| 0.6842 | 0.21 | 300 | 0.6832 | 0.0139 | -0.0056 | 0.6917 | 0.0195 | -72.1567 | -91.8467 | -1.7847 | -2.0513 |
	| 0.672 | 0.27 | 400 | 0.6718 | 0.0281 | -0.0131 | 0.8250 | 0.0412 | -72.2312 | -91.7049 | -1.7836 | -2.0504 |
	| 0.6563 | 0.34 | 500 | 0.6575 | 0.0498 | -0.0211 | 0.8861 | 0.0709 | -72.3116 | -91.4876 | -1.7821 | -2.0494 |
	| 0.6437 | 0.41 | 600 | 0.6416 | 0.0705 | -0.0340 | 0.9111 | 0.1044 | -72.4401 | -91.2810 | -1.7807 | -2.0486 |
	| 0.6261 | 0.48 | 700 | 0.6277 | 0.0885 | -0.0435 | 0.9250 | 0.1320 | -72.5355 | -91.1010 | -1.7796 | -2.0478 |
	| 0.6117 | 0.55 | 800 | 0.6127 | 0.1097 | -0.0567 | 0.9222 | 0.1664 | -72.6675 | -90.8891 | -1.7786 | -2.0474 |
	| 0.6002 | 0.62 | 900 | 0.6019 | 0.1226 | -0.0683 | 0.9278 | 0.1909 | -72.7836 | -90.7598 | -1.7777 | -2.0468 |
	| 0.5912 | 0.68 | 1000 | 0.5912 | 0.1344 | -0.0805 | 0.9333 | 0.2148 | -72.9053 | -90.6422 | -1.7770 | -2.0466 |
	| 0.5822 | 0.75 | 1100 | 0.5822 | 0.1441 | -0.0909 | 0.9472 | 0.2350 | -73.0092 | -90.5447 | -1.7763 | -2.0462 |
	| 0.5789 | 0.82 | 1200 | 0.5759 | 0.1517 | -0.0992 | 0.9333 | 0.2509 | -73.0923 | -90.4690 | -1.7763 | -2.0465 |
	| 0.5689 | 0.89 | 1300 | 0.5722 | 0.1555 | -0.1033 | 0.9500 | 0.2588 | -73.1332 | -90.4305 | -1.7762 | -2.0465 |
	| 0.5694 | 0.96 | 1400 | 0.5702 | 0.1579 | -0.1066 | 0.9417 | 0.2644 | -73.1662 | -90.4070 | -1.7761 | -2.0465 |


	### Framework versions

	- Transformers 4.35.0
	- PyTorch 2.1.1+cu121
	- Datasets 2.14.6
	- Tokenizers 0.14.1
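
To compare a local environment against the versions listed above, a small check with the standard library's `importlib.metadata` may help. This is a sketch only: the helper name `compare_versions` is hypothetical, and the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to be the PyPI names of the listed frameworks.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Versions copied from the list above; keys are assumed PyPI distribution names.
expected_versions = {
    "transformers": "4.35.0",
    "torch": "2.1.1+cu121",
    "datasets": "2.14.6",
    "tokenizers": "0.14.1",
}


def compare_versions(expected: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package to (expected, installed); installed is None if absent."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions(expected_versions).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name}: expected {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; it only flags where the environment differs from the one recorded at training time.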