---
base_model: stabilityai/StableBeluga-13B
tags:
- generated_from_trainer
model-index:
- name: PE-13b-lora
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# PE-13b-lora

This model is a fine-tuned version of [stabilityai/StableBeluga-13B](https://huggingface.co/stabilityai/StableBeluga-13B) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5704
- Rewards/chosen: 0.1581
- Rewards/rejected: -0.1076
- Rewards/accuracies: 0.9472
- Rewards/margins: 0.2658
- Logps/rejected: -73.1769
- Logps/chosen: -90.4042
- Logits/rejected: -1.7758
- Logits/chosen: -2.0462
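
The reward metrics above are named the way preference-optimization trainers (for example, DPO-style trainers) usually log them. This card does not state the training method, so the following is only a sketch under that assumption. It checks that the reported margin is the chosen reward minus the rejected reward, and that a sigmoid preference loss starts at ln 2 ≈ 0.6931 when the margin is zero, which is close to the first validation loss in the training results table. The function name `sigmoid_preference_loss` is hypothetical.

```python
import math

# Final evaluation metrics copied from the list above.
rewards_chosen = 0.1581
rewards_rejected = -0.1076
reported_margin = 0.2658
reported_loss = 0.5704


def sigmoid_preference_loss(margin: float) -> float:
    """Return -log(sigmoid(margin)), the per-pair loss of a DPO-style objective.

    Assumption: the card does not confirm that this objective was used.
    """
    return math.log1p(math.exp(-margin))


# The margin is the chosen reward minus the rejected reward.
margin = rewards_chosen - rewards_rejected

# With no learned preference (margin 0) the loss equals ln 2.
loss_at_zero_margin = sigmoid_preference_loss(0.0)

# The loss is convex in the margin, so the loss at the mean margin is a
# lower bound on the mean per-pair loss (Jensen's inequality).
loss_at_mean_margin = sigmoid_preference_loss(reported_margin)

print(f"margin from rewards:  {margin:.4f} (reported {reported_margin})")
print(f"loss at zero margin:  {loss_at_zero_margin:.4f}")
print(f"loss at mean margin:  {loss_at_mean_margin:.4f} (reported mean loss {reported_loss})")
```

The small gap between the computed and reported margin comes from rounding in the logged values.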

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-07
- train_batch_size: 6
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 2
- total_train_batch_size: 96
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
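
The totals in the list above follow from the per-device values: the training total multiplies the per-device batch size by the device count and the gradient accumulation steps, and the evaluation total omits accumulation. The sketch below reproduces that arithmetic and illustrates the general shape of a linear schedule with a 10% warmup. The helper `linear_schedule_lr` and the value `assumed_total_steps` are hypothetical; the card does not state the total number of optimizer steps.

```python
import math

# Values copied from the hyperparameter list above.
train_batch_size = 6             # per device
eval_batch_size = 4              # per device
num_devices = 8
gradient_accumulation_steps = 2
learning_rate = 5e-07
warmup_ratio = 0.1

# Effective batch sizes across all devices.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices


def linear_schedule_lr(step: int, total_steps: int,
                       peak_lr: float = learning_rate,
                       ratio: float = warmup_ratio) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay back to 0.

    Hypothetical helper that only illustrates the schedule's shape.
    """
    warmup_steps = math.ceil(total_steps * ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Assumption for illustration only: the real step count is not given in the card.
assumed_total_steps = 1000

print(f"total_train_batch_size = {total_train_batch_size}")
print(f"total_eval_batch_size  = {total_eval_batch_size}")
for step in (0, 50, 100, 550, 1000):
    print(f"step {step:4d}: lr = {linear_schedule_lr(step, assumed_total_steps):.3e}")
```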

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.693 | 0.07 | 100 | 0.6933 | -0.0008 | -0.0005 | 0.4889 | -0.0003 | -72.1053 | -91.9932 | -1.7861 | -2.0525 |
| 0.69 | 0.14 | 200 | 0.6901 | 0.0031 | -0.0015 | 0.5611 | 0.0046 | -72.1153 | -91.9544 | -1.7859 | -2.0524 |
| 0.6842 | 0.21 | 300 | 0.6832 | 0.0139 | -0.0056 | 0.6917 | 0.0195 | -72.1567 | -91.8467 | -1.7847 | -2.0513 |
| 0.672 | 0.27 | 400 | 0.6718 | 0.0281 | -0.0131 | 0.8250 | 0.0412 | -72.2312 | -91.7049 | -1.7836 | -2.0504 |
| 0.6563 | 0.34 | 500 | 0.6575 | 0.0498 | -0.0211 | 0.8861 | 0.0709 | -72.3116 | -91.4876 | -1.7821 | -2.0494 |
| 0.6437 | 0.41 | 600 | 0.6416 | 0.0705 | -0.0340 | 0.9111 | 0.1044 | -72.4401 | -91.2810 | -1.7807 | -2.0486 |
| 0.6261 | 0.48 | 700 | 0.6277 | 0.0885 | -0.0435 | 0.9250 | 0.1320 | -72.5355 | -91.1010 | -1.7796 | -2.0478 |
| 0.6117 | 0.55 | 800 | 0.6127 | 0.1097 | -0.0567 | 0.9222 | 0.1664 | -72.6675 | -90.8891 | -1.7786 | -2.0474 |
| 0.6002 | 0.62 | 900 | 0.6019 | 0.1226 | -0.0683 | 0.9278 | 0.1909 | -72.7836 | -90.7598 | -1.7777 | -2.0468 |
| 0.5912 | 0.68 | 1000 | 0.5912 | 0.1344 | -0.0805 | 0.9333 | 0.2148 | -72.9053 | -90.6422 | -1.7770 | -2.0466 |
| 0.5822 | 0.75 | 1100 | 0.5822 | 0.1441 | -0.0909 | 0.9472 | 0.2350 | -73.0092 | -90.5447 | -1.7763 | -2.0462 |
| 0.5789 | 0.82 | 1200 | 0.5759 | 0.1517 | -0.0992 | 0.9333 | 0.2509 | -73.0923 | -90.4690 | -1.7763 | -2.0465 |
| 0.5689 | 0.89 | 1300 | 0.5722 | 0.1555 | -0.1033 | 0.9500 | 0.2588 | -73.1332 | -90.4305 | -1.7762 | -2.0465 |
| 0.5694 | 0.96 | 1400 | 0.5702 | 0.1579 | -0.1066 | 0.9417 | 0.2644 | -73.1662 | -90.4070 | -1.7761 | -2.0465 |

### Framework versions

- Transformers 4.35.0
- PyTorch 2.1.1+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1